Introduction

The  past  decade  has  witnessed  a  growing  interest  in  contract  theories  of 
various  kinds.   This  development  is  partly  a  reaction  to  our  rather  thorough 
understanding  of  the  standard  theory  of  perfect  competition  under  complete 
markets,  but  more  importantly  to  the  resulting  realization  that  this  paradigm 
is  insufficient  to  accommodate  a  number  of  important  economic  phenomena. 
Studying  in  more  detail  the  process  of  contracting  —  particularly  its  hazards 
and  imperfections  —  is  a  natural  way  to  enrich  and  amend  the  idealized 
competitive  model  in  an  attempt  to  fit  the  evidence  better.   At  present  it  is 
the  major  alternative  to  models  of  imperfect  competition;  we  will  comment  on 
its  comparative  advantage  below. 

In  one  sense,  contracts  provide  the  foundation  for  a  large  part  of 
economic  analysis.   Any  trade  —  as  a  quid  pro  quo  —  must  be  mediated  by  some 
form of contract, whether it be explicit or implicit. In the case of spot
trades,  however,  where  the  two  sides  of  the  transaction  occur  almost 
simultaneously,  the  contractual  element  is  usually  down-played,  presumably 
because  it  is  regarded  as  trivial  (although  this  need  not  be  the  case  —  see 
Part  III).   In  recent  years,  economists  have  become  much  more  interested  in 
long-term  relationships  where  a  considerable  amount  of  time  may  elapse  between 
the  quid  and  the  quo.   In  these  circumstances,  a  contract  becomes  an  essential 
part  of  the  trading  relationship. 

Of  course,  long-term  contracts  are  not  new  in  economics.   Contingent 
commodity  trades  of  the  Arrow-Debreu  type  are  examples  par  excellence  of  such 
contracts.   What  does  seem  new  is  the  analysis  of  contracts  written  by  and 
covering  a  small  number  of  people.   That  is,  there  has  been  a  move  away  from 
the  impersonal  Arrow-Debreu  market  setting  where  people  make  trades  "with  the 
market",  to  a  situation  where  firm  A  and  firm  B,  or  firm  C  and  union  D,  write 
a  long-term  contract.   This  departure  is  not  without  economic  significance. 


Williamson  (1985),  in  particular,  has  stressed  the  importance  of  situations 
where  a  small  number  of  parties  make  investments  which  are  to  some  extent 
relationship-specific;  that  is,  once  made,  they  have  a  much  higher  value 
inside  the  relationship  than  outside.   Given  this  "lock-in"  effect,  each  party 
will  have  some  monopoly  power  ex-post,  although  there  may  be  plenty  of 
competition  ex-ante  before  investments  are  sunk.   Since  the  parties  cannot 
rely  on  the  market  once  their  relationship  is  underway,  the  obvious  way  for 
them  to  regulate,  and  divide  the  gains  from,  trade  is  via  a  long-term 
contract.   Until  the  advent  of  contract  theory,  economists  did  not  have  the 
tools  to  analyze  ex-ante  competitive,  ex-post  noncompetitive  relationships  of 
this  type  via  formal  models. 

Research  on  contracts  has  progressed  along  several  different  lines,  each 
with  its  own  particular  interests.   It  may  be  useful  to  begin  by  mentioning 
some  of  these  directions  before  outlining  what  subjects  our  paper  will 
concentrate  on. 

One  strand  of  the  literature  has  focused  on  the  internal  organization  of 
the  firm,  viewing  the  firm  itself  as  a  response  to  failures  in  the  price 
system.   Questions  of  interest  include  structuring  incentives  for  members  of 
the  firm,  allocating  decision  authority  and  choosing  decision  rules  to  be 
implemented  by  suitable  reward  structures.   Of  course  the  objective  is  partly 
to gain insight into organization theory as such. But more importantly
perhaps, one is interested in knowing whether organization theory matters in
the aggregate, i.e., to what extent the conduct of firms will differ
from the profit-maximizing behavior assumed in classical theory and, if it
differs, what ramifications that has for market outcomes and overall
allocations in the economy.

Another  prominent  line  of  research  has  explored  the  workings  of  the 
labor market. A plausible hypothesis is that contingent claims for labor
services are limited for reasons of opportunism. This invites innovation of
other  types  of  contracts  that  can  be  used  as  substitutes.   The  research  has 
centered  on  the  structure  of  optimal  bilateral  labor  contracts  under  various 
assumptions  about  enforcement  opportunities,  the  properties  contractual 
equilibria  will  have  and  in  particular  whether  these  equilibria  will  exhibit 
the  commonly  claimed  inefficiencies  associated  with  real  world  adjustments  in 
employment. 

Inspired  by  the  possibility  that  long-term  contracts  may  embody  price 
and  wage  sluggishness,  a  related  body  of  work  has  explored  their  macroeconomic 
implications (see, e.g., Fischer (1977) and Taylor (1980)). Unlike most
contract analysis, this literature has taken the form of contracts as given,
typically  with  nominal  wage  and  price  rigidities.   This  is  not  as  satisfactory 
as  working  from  first  principles,  but  it  has  made  policy  analysis  quite 
tractable. 

Financial  markets  offer  another  arena  of  substantial  potential  for 
contract  theoretic  studies  that  is  beginning  to  be  recognized.   The  importance 
of  limited  contracting  for  the  emergence  of  financial  services  and 
institutions  has  been  suggested  in  papers  such  as  D.  Diamond  (1984),  Gale  and 
Hellwig  (1985)  and  Townsend  (1980).   This  line  of  research  also  offers 
prospects  for  a  careful  modelling  of  the  role  of  money  and  the  conduct  of 
monetary  policy  (see  Townsend  (1985)  and  D.  Diamond  (1985)). 

As  the  field  is  progressing,  it  becomes  harder  to  place  models  in 
specific  categories.   Initially,  models  of  organizational  design  ignored 
market  forces,  or  at  least  treated  them  in  a  very  primitive  fashion.   In 
contrast, the theory of labor contracts started out without consideration for
organizational incentives. More recent models, however, treat both incentive
and market issues concurrently. Such crossbreeding is fruitful, but it makes
the task of organizing the subjects of a paper like this much harder. Since
we have been unable to come up with a natural classification that would avoid
this  problem,  we  will  stick  to  an  outline  that  follows  the  historical  progress 
rather  closely. 

We  begin  in  Part  I  with  agency  theory  as  a  representative  paradigm  for 
the organization theoretic aspects of contracting. From there we go on to
labor contracting (Part II). Finally, we turn to incomplete contracts and the
aforementioned "lock-in" effects (Part III). This work, which represents more
recent methodological trends in contract research, has not advanced very far
yet, and our discussion here will be correspondingly more tentative in nature.

Needless  to  say,  we  will  not  attempt  a  comprehensive  survey  of  the  large 
number  of  contractual  models  that  have  appeared  to  date.   Some  subjects,  for 
instance  models  relating  contracts  to  macroeconomic  policy,  are  entirely  left 
out.   So  are  models  of  financial  contracting.   Our  intention  has  been  to  be 
selective  and  critical  rather  than  comprehensive.   While  we  allow  ourselves  a 
rather  opinionated  tone,  we  hope  that  the  paper  still  gives  a  good  idea  of  the 
general  nature  of  the  ongoing  research  and  a  reasonably  fair  assessment  of  its 
main  contributions. 

Despite the selective approach, the paper has grown very long. In order
for it to be more readily digestible we have set it up so that the three parts
can  be  read  essentially  independently.   Each  part  has  a  concluding  section 
that  sums  up  its  major  points. 

A  Word  About  Methodology 

Most  contract  theories  are  based  on  the  assumption  that  the  parties  at 
some  initial  date  (zero  say)  design  a  Pareto  optimal  (for  them)  long-term 
contract.   Optimality  is  not  to  be  understood  in  a  first-best  sense,  but 
rather in a constrained or second-best sense. Indeed, informational and other
restrictions, which force the contract to be second-best, are at the heart of
the analysis; without them one would quickly be back in the standard
Arrow-Debreu paradigm where contractual form is inessential. Since informational
constraints  will  play  a  particularly  important  role  in  the  ensuing  discussion, 
let  us  note  right  away  that  we  will  throughout  restrict  attention  to  cases  in 
which  informational  asymmetries  arise  only  subsequent  to  contracting.   In  the 
typical  language  of  the  literature,  we  will  not  consider  adverse  selection 
models. 

The  design  of  a  Pareto  optimal  contract  proceeds  by  maximizing  one 
party's  expected  utility  subject  to  the  other  party  (or  parties)  receiving  a 
minimum (reservation) expected utility level. Which party's utility level is
taken as a constraint usually does not matter, because most analyses are
partial equilibrium. When there is perfect competition ex ante, this
reservation utility can be interpreted as that party's date zero opportunity
cost determined in the date zero market for contracts. When ex ante
competition is imperfect, the parties will presumably bargain over the ex ante
surplus from the relationship and so the reservation expected utility levels
become endogenous.
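
As a sketch, the design problem just described can be written compactly. The notation here is ours and does not appear in the text: $\sigma$ is a contract in a feasible set $\Sigma$ (which embeds the informational and other restrictions), $U_1$ and $U_2$ are the two parties' date zero expected utilities, and $\bar{U}_2$ is the second party's reservation level.

```latex
\max_{\sigma \in \Sigma} \; U_1(\sigma)
\quad \text{subject to} \quad
U_2(\sigma) \ge \bar{U}_2 .
```

Varying $\bar{U}_2$ traces out the constrained Pareto frontier. Under perfect ex ante competition $\bar{U}_2$ is given by the market; under bargaining it is determined together with $\sigma$.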

The  literature  has  often  been  cavalier  about  the  determinants  of  the 
reservation  utility,  because  valuable  insights  have  emerged  from  the  general 
characteristics  of  Pareto  optimality  alone.   On  the  other  hand,  the  fact  that 
market  forces  reduce  to  simple  constraints  on  expected  utilities  greatly 
facilitates  equilibrium  analysis.   Equilibration  in  expected  utilities  is 
usually  trivial.   This  gives  the  contractual  approach  its  main  methodological 
advantage  relative  to  models  of  imperfect  competition,  for  instance.   The 
analytical  core  of  contract  theory  is  an  optimization  problem,  while  in 
imperfect  competition  it  is  an  equilibrium  problem.   Methods  for  solving 
optimization exercises are substantially more advanced than methods for
solving equilibrium problems.

Of  course,  substituting  an  optimization  analysis  for  an  equilibrium 
analysis is not always economically meaningful (for instance, we are not
implying that imperfect competition should be studied in this way). Indeed,
the economic credibility of the contractual approach may be called into
question when, as often happens, optimal contracts become monstrous
state-contingent prescriptions. How are such contracts written and enforced?

Three  responses  to  this  question  can  be  offered.   The  first  one  is  to 
appeal  to  the  powers  of  the  judicial  system  and  its  ability  to  enforce  certain 
explicitly  agreed  upon  contractual  terms.   The  assumption  is  that  sufficient 
penalties,  either  pecuniary  or  non-pecuniary,  will  be  imposed  for  breach  and 
hence rational parties will not breach. This assumption makes a model
internally consistent, but is unsatisfactory on two counts. First, it maintains
an artificial dichotomy between those contractual provisions that are assumed to
be infinitely costly to enforce and those that are assumed to be completely
costless to enforce. Second, it often predicts (by assumption) explicit terms
that are much more complex than those we observe and in that sense is no
answer to what prompted the enforcement question above.

The  second  response  is  a  pragmatic  one:   one  could  argue  that 
qualitative  and  aggregate  features,  rather  than  contractual  detail,  are  the 
relevant  ones  for  judging  the  success  of  a  model.   In  support  of  this  view  one 
can  allude  to  the  implicit  nature  of  contracts  in  the  real  world;  in  other 
words,  suggest  that  equilibrium  outcomes  in  the  real  world  mimic  optimal, 
complex  state-contingent  contracts,  despite  the  relative  simplicity  of  the 
explicit  agreements  we  observe.   The  difficulty  with  this  response  is  that  we 
do  not  understand  well  how  implicit  contracts  of  this  type  are  sustained  as 
equilibrium  phenomena. 

Ideally, one would like to know what determines the division between
explicit and implicit enforcement of a contract. This leads to the third
approach,  which  is  to  confront  the  enforcement  issue  explicitly  by  including 
realistic  legal  penalties  for  breach  as  well  as  indirect  costs  that  affect 
equilibrium  behavior,  for  instance  through  reputational  concerns.   While  much 
of  the  extant  literature  rests  on  a  combination  of  the  first  two  responses  to 
the  enforcement  issue,  the  present  trend  is  towards  this  more  ambitious,  but 
also  more  satisfactory,  third  approach.   This  will  be  discussed  at  some  length 
in the last part of the paper.

Part I: Agency Models.

1.1   Introduction. 

Agency  relationships  are  ubiquitous  in  economic  life.  Wherever  there 
are  gains  to  specialization  there  is  likely  to  arise  a  relationship  in  which 
agents  act  on  behalf  of  a  principal,  because  of  comparative  advantage. 
Examples  abound:   workers  supplying  labor  to  a  firm,  managers  acting  on 
behalf  of  owners,  doctors  serving  patients,  lawyers  advising  clients.  The 
economic  value  of  decision-making  made  on  behalf  of  someone  else  would 
easily  seem  to  match  the  value  of  individual  consumption  decisions.   In  this 
light  the  attention  paid  to  agency  problems  has  been  relatively  slight. 
Moreover, there are some less obvious instances of the same formal agency
structure: the government taxing its citizens, the monopolist
price-discriminating among customers, the regulator controlling firms, all of
which are substantial problems in their own right.

If  agents  could  costlessly  be  induced  to  internalize  the  principal's 
objectives,  there  would  be  little  reason  to  study  agency.   Things  become 
interesting  only  when  objectives  cannot  be  automatically  aligned.   So  what 
is  it  that  prevents  inexpensive  alignment?  The  most  plausible  and  commonly 
offered  reason  is  asymmetric  information,  which  of  course  ties  closely  to 
the source of agency: returns to specialization. The sincerity of a
worker's labor input is often hard to verify, leading to problems with
shirking. Informational expertise permits managers to pursue goals of their
own such as enhanced social status or improved career opportunities.
Private  information  about  individual  characteristics  causes  problems  for  the 
government  in  collecting  taxes. 


Thus, underlying each agency model is an incentive problem caused by
some form of asymmetric information. It is common to distinguish models
based on the particular information asymmetry involved. We will use the
following  taxonomy.   All  models  in  which  the  agent  has  precontractual 
information  we  place  under  the  heading  of  adverse  selection.   Except  for  an 
occasional  reference,  we  will  not  deal  at  all  with  this  category.   Our 
models  will  assume  symmetric  information  before  contracting.   Within  this 
category,  which  we  will  refer  to  as  moral  hazard  models,  a  further 
distinction  is  useful:   the  case  where  the  agent  takes  unobservable  actions 
and  the  case  where  his  actions  may  be  observed,  but  not  the  contingencies 
under  which  his  actions  were  taken.   Arrow  (1985)  has  recently  suggested  the 
informative  names:  Hidden  Action  Model  and  Hidden  Information  Model  for 
these  two  subcategories.   The  worker  supplying  unobservable  effort  is  the 
prototypical hidden action case, while the expert-manager making observable
investment  decisions  leads  to  a  typical  hidden  information  model. 

As  will  become  clear  shortly,  the  hidden  action  case  formally  subsumes 
the  hidden  information  case.   (This  rationalizes  our  use  of  moral  hazard  as 
a  joint  label.)   Nevertheless,  it  is  meaningful  to  keep  the  two  distinct, 
because  they  differ  in  their  economic  implications  as  well  as  in  their 
solution  techniques.   In  this  part  we  will  focus  on  the  hidden  action  case. 
The  next  part  on  labor  contracting  will  illustrate  the  hidden  information 
case. 

The  general  objective  of  an  agency  analysis  is  to  characterize  the 
optimal  organizational  response  to  the  incentive  problem.  Typically,  the 
analysis  delivers  a  second-best  reward  structure  for  the  agent,  based  on 
information  that  can  be  included  in  the  contract.   Characterizing  the 
optimal incentive scheme is important but not the prime economic purpose.
More interesting are the allocational distortions that come with the
incentive solution. While one often could design incentive schemes that
induce the agent to behave in the same way that he would if no information
asymmetry were present, that is rarely second-best. Instead, some of the
costs of the information asymmetry are borne by distortions in decision
rules, task assignments, and other costly institutional arrangements.
This is what gives the theory its main economic content.

The   agency  paradigm  has   indeed  been  quite  successful    in  shedding  light 
on  institutional   phenomena  that   are  beyond  received  microeconomic  theory. 
The  second-best  nature  of   "incentive  efficient"   solutions   admits   a  host  of 
arrangements   that  would  be   inexplicable  if   information  flows  were  costless. 
Examples abound in the literature and we could easily use up our allotted
space  by  describing  some  of   them.      However,   we  have   chosen  not  to  follow 
this  line,   but  rather  to  be  more  methodologically  oriented.      Agency  models 
are  not  without   problems   and  this   is  best  brought  home  by  going  into  the 
details   of   a  generic  structure. 

We  will   begin  with  three   different   formulations   of   the   agency   problem, 
each  with  its   own  merits.      Next   we   go  on  to  discuss   a  simple  version  of 
hidden  action,   which  will   suffice  to  sum   up  the  main  insights   of   that   type 
of  model.      An  economic  assessment   and   critique  follows,   which   in  turn  leads 
us   to  a  discussion  of   recent   improvement   efforts.      These   include  the  role  of 
robustness   in  simplifying   incentive  schemes    and  the  use  of   dynamic  models   to 
arrive   at   richer   predictions.      The   last  section  provides   a  summary  of  what 
agency theory in our view has to offer and what its shortcomings are.

1.2 Three Formulations.

Let $A$ be the set of actions available to the agent and denote a generic
element of $A$ by $a$. Let $\theta$ represent a state of nature drawn from a
distribution $G$. The agent's action and the state of nature jointly
determine a verifiable outcome $x = x(a,\theta)$ as well as a monetary payoff
$\pi = \pi(a,\theta)$. The verifiable outcome $x$ can be a vector and may
include $\pi$. The monetary payoff belongs to the principal. His problem is to
construct a reward scheme $s(x)$, which takes outcomes into payments for the
agent.

The principal values money according to the utility function $v(m)$ and
the agent according to the utility function $u(m)$. The agent also incurs a
cost from taking the action $a$, which we denote $c(a)$. We assume initially
that the agent's cost of action is independent of his wealth, i.e., that his
total utility is $u(s(x)) - c(a)$. The principal's total utility is
$v(\pi - s(x))$.

The agent and the principal agree on the distribution $G$, the technology
$x(\cdot,\cdot)$ and the utility and cost functions.

This   is   the  state-space   formulation  of   the   agency  problem   as    initiated 
by  Wilson    (1969),    Spence   and  Zeckhauser    (1971)    and  Ross    (1973).      Its  main 
advantage  is   that  the  technology  is  presented  in  what   appear  to  be  the  most 
natural   terms.      Economically,   however,    it  does   not  lead  to  a  very 
informative   solution. 

There is another, equivalent way of looking at the above problem, which
yields more economic insights. By the choice of $a$, the agent effectively
chooses a distribution over $x$ and $\pi$, which can be derived from $G$ via the
technology $x(\cdot,\cdot)$. Let us denote the derived distribution $F(\pi,x;a)$
and its density (or mass function) $f(\pi,x;a)$. This parametrized distribution
formulation was pioneered by Mirrlees (1974, 1976) and further explored in
Holmstrom (1979). For later reference, let us state the principal's problem
mathematically in parametrized distribution terms. His problem is to:

$$\max_{a \in A,\; s(\cdot) \in S} \int v(\pi - s(x))\, f(\pi, x; a)\, dx \quad \text{s.t.} \tag{1.1}$$

$$\int u(s(x))\, f(\pi, x; a)\, dx - c(a) \ge \bar{u}, \tag{1.2}$$

$$\int u(s(x))\, f(\pi, x; a)\, dx - c(a) \ge \int u(s(x))\, f(\pi, x; a')\, dx - c(a'), \quad \forall a' \in A. \tag{1.3}$$

In this program the principal is seen as deciding on the action he
wants the agent to implement and picking the least cost incentive scheme
that goes along with that action. It is worth noting that since the
principal knows the agent (his preferences), he also knows what action the
agent will take even though he cannot directly observe it. Constraint (1.3)
assures that the incentive scheme is consistent with the action the agent
will pick when he maximizes his expected utility, while constraint (1.2)
assures the agent a minimum expected utility level $\bar{u}$, presumably
determined in the market place.
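
To make the program concrete, the following minimal sketch derives the outcome distribution from a state-space description and then checks constraints (1.2) and (1.3) for two candidate schemes. Everything numerical is an assumption made for illustration and is not taken from the text: the three-point distribution `G`, the threshold technology, the costs, the utility function $u(m) = \sqrt{m}$, a risk-neutral principal with $v(m) = m$, and $\pi = x$.

```python
import math

# All primitives below are hypothetical numbers chosen for illustration.
G = {0: 0.25, 1: 0.5, 2: 0.25}      # distribution of the state of nature theta
COST = {"L": 0.0, "H": 0.4}          # agent's cost c(a) of each action
EFFORT = {"L": 0, "H": 1}

def technology(a, theta):
    """Outcome x(a, theta): the project pays 10 if effort plus luck reaches 2, else 0."""
    return 10 if EFFORT[a] + theta >= 2 else 0

def derived_density(a):
    """Mass function f(x; a) induced by G through the technology."""
    f = {}
    for theta, prob in G.items():
        x = technology(a, theta)
        f[x] = f.get(x, 0.0) + prob
    return f

def agent_utility(s, a):
    """Left-hand side of (1.2): E[u(s(x)) | a] - c(a), with u(m) = sqrt(m)."""
    return sum(p * math.sqrt(s[x]) for x, p in derived_density(a).items()) - COST[a]

def principal_utility(s, a):
    """Objective (1.1) for a risk-neutral principal, v(m) = m, with pi = x."""
    return sum(p * (x - s[x]) for x, p in derived_density(a).items())

def is_feasible(s, a, u_bar, tol=1e-12):
    """Check participation (1.2) and incentive compatibility (1.3) for action a."""
    own = agent_utility(s, a)
    participation = own >= u_bar - tol
    incentive = all(own >= agent_utility(s, b) - tol for b in COST)
    return participation and incentive

bonus_scheme = {0: 1.0, 10: 4.0}    # pays more after the high outcome
flat_scheme = {0: 2.25, 10: 2.25}   # full insurance for the agent

for name, s in [("bonus", bonus_scheme), ("flat", flat_scheme)]:
    for a in ("H", "L"):
        print(name, a, is_feasible(s, a, u_bar=1.0), round(principal_utility(s, a), 3))
```

In this example the flat scheme can only implement the low action, while the bonus scheme implements the high action at the price of exposing the agent to risk. The sketch checks feasibility only; it does not search for the least cost scheme.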

A solution to the principal's program is not automatically assured; in
fact simple examples can be given in which no optimal solution exists. We
will encounter a non-existence example shortly, but otherwise we merely
assume a solution exists.

The third, most abstract, formulation is the following. Since the
agent in effect chooses among alternative distributions, one is naturally
led to take the distributions themselves as the actions, dropping the
reference to $a$. Let $p$ denote a chosen density (or mass) function over $\pi$
and $x$ and let $P$ be the set of feasible densities from which the agent can
choose. Since the agent can randomize among actions, $P$ can be assumed
convex. In the case $(\pi,x)$ takes on a finite number of values, $P$ is a
simplex. The cost function in this case is written as $C(p)$, which also will
be convex because of randomization.

Of  course,    the  economic   interpretation  of   the   agent's   action  and  the 
incurred  cost   is   obscured  in  this   general   distribution  formulation,   but   in 
return  one   gets    a  very  streamlined  model   of   particular   use  in  understanding 
the  formal   structure  of   the   problem. 

This way of looking at the principal's problem is also very general.
It covers situations in which the agent may observe some information about
the cost of his actions or the expected returns from his actions, before
actually deciding what to do; in other words, cases of hidden information.
To see this, simply note that whatever strategy the agent uses for choosing
actions contingent on information he observes, the strategy will in reduced
form map into a distribution choice over $(\pi,x)$. Thus, ex ante strategic
choices are equivalent to distribution choices in some $P$ (properly
restricted, of course). Note also that the primitive cost function for
actions, $c(a)$, could be stochastic without affecting the general
formulation. Taking expectations over costs $c(a)$ would still translate into
a cost function $C(p)$, because the agent's utility function is separable.
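
The reduction from strategies to distributions can be illustrated with a small hidden information example. This is a sketch under assumptions that are entirely ours: the agent privately observes a signal, then takes an observable action, and the verifiable outcome is the pair (action, payoff). The signal probabilities, payoff distributions, and signal-dependent costs are hypothetical numbers.

```python
from itertools import product

# Hypothetical primitives, chosen only for illustration.
SIGNAL_PROB = {"good": 0.5, "bad": 0.5}            # what the agent privately observes
ACTIONS = ("invest", "wait")                        # observable decisions
PAYOFF_DIST = {                                     # P(pi | signal, action)
    ("good", "invest"): {10: 0.75, 0: 0.25},
    ("bad", "invest"): {10: 0.25, 0: 0.75},
    ("good", "wait"): {0: 1.0},
    ("bad", "wait"): {0: 1.0},
}
COST = {("good", "invest"): 0.25, ("bad", "invest"): 0.5,
        ("good", "wait"): 0.0, ("bad", "wait"): 0.0}

def reduced_form(strategy):
    """Map a strategy (signal -> action) into the pair (p, expected cost).

    p is the induced mass function over verifiable outcomes x = (action, pi).
    """
    p, cost = {}, 0.0
    for signal, prob in SIGNAL_PROB.items():
        action = strategy[signal]
        cost += prob * COST[(signal, action)]
        for payoff, q in PAYOFF_DIST[(signal, action)].items():
            key = (action, payoff)
            p[key] = p.get(key, 0.0) + prob * q
    return p, cost

def mix(weight, first, second):
    """Randomizing between two strategies mixes both the distributions and the costs."""
    (p1, c1), (p2, c2) = reduced_form(first), reduced_form(second)
    keys = set(p1) | set(p2)
    p = {k: weight * p1.get(k, 0.0) + (1 - weight) * p2.get(k, 0.0) for k in keys}
    return p, weight * c1 + (1 - weight) * c2

strategies = [dict(zip(SIGNAL_PROB, choice)) for choice in product(ACTIONS, repeat=2)]
for strategy in strategies:
    print(strategy, reduced_form(strategy))
```

Each of the four pure strategies collapses to one point $p$ with an associated expected cost, and `mix` shows why the feasible set can be taken to be convex: any randomization between two strategies yields the corresponding convex combination of distributions at the same combination of costs.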

1.3     The   Basic  Hidden  Action  Model. 


Many of the general insights obtained from studying hidden action
models can be conveyed in the simplest setting where the agent has only two
actions to choose from. For concreteness, let us identify them with working
hard, $H$, and being lazy, $L$. Also, assume for the moment that $x$ coincides
with the monetary payoff to the principal and that the principal is
risk-neutral. If the agent works hard, the distribution over $x$ is $f_H(x)$,
while if he is lazy, the distribution is $f_L(x)$. In view of this language it
is natural to assume that $f_H$ dominates $f_L$ in a first-order stochastic
dominance sense, i.e., that the cumulative distribution functions satisfy
$F_H(x) \le F_L(x)$ for all $x$, and that the cost of hard work, $c_H$, is
greater than the cost of being lazy, $c_L$.

Substituting these simplifying assumptions into (1.1)-(1.3) gives a
straightforward program that can be easily solved. First, note that if the
principal wants to implement $L$, then he should pay the agent a constant,
because that yields optimal risk-sharing. The problem therefore assumes
interest only if the principal wishes to implement $H$, because now some
risk-sharing benefits have to be sacrificed in order to provide the agent with
the right incentives. Letting $\lambda$ and $\mu$ be the Lagrangian multipliers
for constraints (1.2) and (1.3) respectively, we see that the optimal sharing
rule has to satisfy:


$$\frac{1}{u'(s(x))} = \lambda + \mu\left[1 - \frac{f_L(x)}{f_H(x)}\right], \quad \text{for a.e. } x. \tag{1.4}$$

This is a particular version of Mirrlees' (1974, 1975) formula,
analyzed and interpreted further in Holmstrom (1979). Let us discuss its
revealing message.

First, note that if $\mu = 0$, then we have first-best risk sharing ($s(x)$
constant) and the agent picks $L$ in violation of the incentive constraint.
Therefore, $\mu > 0$. With $\mu$ positive, $s(x)$ will vary with the outcome
$x$, trading off some risk sharing benefits for incentive provision; more
precisely, it will vary with the likelihood ratio $f_L(x)/f_H(x)$. To
understand why, a few words on the likelihood ratio are in order.

The likelihood ratio is a concept familiar from statistical inference.
It reflects how strongly $x$ signals that the true distribution from which the
sample was drawn is $f_L$ rather than $f_H$. A high likelihood ratio speaks for
$L$ and a low for $H$; a value of one is the intermediate case in which nothing
new is learned from the sample, because it could equally well have come from
either of the two distributions.
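
The role of the likelihood ratio in (1.4) can be illustrated numerically. The sketch below assumes a particular utility function, $u(m) = 2\sqrt{m}$, for which $1/u'(s) = \sqrt{s}$ and (1.4) can be solved in closed form. It also assumes that both constraints bind and that the solution is interior. The three-outcome distributions, the costs, and the reservation utility are hypothetical numbers, not taken from the text.

```python
# Hypothetical two-action example on three outcomes; all numbers are illustrative.
F_H = [0.2, 0.3, 0.5]     # outcome probabilities under hard work
F_L = [0.5, 0.3, 0.2]     # outcome probabilities under laziness
C_H, C_L = 0.63, 0.0      # costs of the two actions
U_BAR = 1.37              # agent's reservation utility

def likelihood_ratios(f_l, f_h):
    """f_L(x) / f_H(x) for each outcome x."""
    return [l / h for l, h in zip(f_l, f_h)]

def sharing_rule(f_h, f_l, c_h, c_l, u_bar):
    """Solve (1.4) for s(x) when u(m) = 2*sqrt(m), so that 1/u'(s) = sqrt(s).

    Assumes both constraints bind and that the implied sqrt(s(x)) is nonnegative.
    """
    # Participation binding: E_H[2*sqrt(s)] - c_h = u_bar, and E_H[1 - f_L/f_H] = 0.
    lam = (u_bar + c_h) / 2
    # Incentive constraint binding: sum (f_H - f_L) * 2*sqrt(s) = c_h - c_l.
    mu = (c_h - c_l) / (2 * sum((h - l) ** 2 / h for h, l in zip(f_h, f_l)))
    roots = [lam + mu * (1 - r) for r in likelihood_ratios(f_l, f_h)]
    if min(roots) < 0:
        raise ValueError("interior solution assumed; sqrt(s(x)) would be negative")
    return lam, mu, [w ** 2 for w in roots]

def expected_utility(s, f, cost):
    """Agent's expected utility E[2*sqrt(s(x))] - cost under mass function f."""
    return sum(p * 2 * pay ** 0.5 for p, pay in zip(f, s)) - cost

lam, mu, s = sharing_rule(F_H, F_L, C_H, C_L, U_BAR)
print("likelihood ratios:", likelihood_ratios(F_L, F_H))
print("multipliers:", lam, mu)
print("payments s(x):", s)
```

Outcomes with a likelihood ratio above one are paid less than the outcome whose ratio equals one, and outcomes with a ratio below one are paid more. The payment depends on $x$ only through $f_L(x)/f_H(x)$.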

The agency problem is not an inference problem in a strict statistical
sense; conceptually, the principal is not inferring anything about the
agent's action from $x$, because he already knows what action is being
implemented. Yet, the optimal sharing rule reflects precisely the
principles of inference. This can be seen even more transparently by
rewriting (1.4) formally in terms of a "posterior distribution" derived from
updating a "prior" on $H$. Let the prior be $\gamma$ (= probability of $H$) and
denote the posterior $\gamma'(x)$. Then
$\gamma'(x) = \gamma f_H(x) / [\gamma f_H(x) + (1-\gamma) f_L(x)]$ by Bayes' rule
and we have:

$$\frac{1}{u'(s(x))} = \lambda + \mu\left\{\frac{\gamma'(x) - \gamma}{\gamma'(x)(1 - \gamma)}\right\}. \tag{1.4'}$$

From (1.4') we see that the agent is punished for outcomes that "revise
beliefs" about $H$ down, while he is rewarded for outcomes that "revise
beliefs" up. Moreover, the sharing rule is a function of $x$ only through
the posterior assessment $\gamma'(x)$; outcomes that lead to the same posterior
imply the same reward. As in statistical decision theory, the posterior is
a sufficient statistic about the experimental outcome.
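
The equivalence between the likelihood ratio form (1.4) and the posterior form (1.4') can be checked with exact arithmetic. The mass functions below are hypothetical. The check itself relies only on Bayes' rule for a two-point prior.

```python
from fractions import Fraction as Fr

# Hypothetical mass functions on four outcomes; outcomes 0 and 1 share a likelihood ratio.
F_H = [Fr(1, 10), Fr(2, 10), Fr(3, 10), Fr(4, 10)]
F_L = [Fr(2, 10), Fr(4, 10), Fr(3, 10), Fr(1, 10)]

def posterior(prior, f_h, f_l):
    """Bayes' rule: probability of H after observing each outcome x."""
    return [prior * h / (prior * h + (1 - prior) * l) for h, l in zip(f_h, f_l)]

def likelihood_term(f_h, f_l):
    """The bracketed term of (1.4): 1 - f_L(x)/f_H(x)."""
    return [1 - l / h for h, l in zip(f_h, f_l)]

def posterior_term(prior, f_h, f_l):
    """The bracketed term of (1.4'): (g'(x) - g) / (g'(x) * (1 - g)), g the prior."""
    return [(post - prior) / (post * (1 - prior)) for post in posterior(prior, f_h, f_l)]

for prior in (Fr(1, 4), Fr(1, 2), Fr(9, 10)):
    # The two terms coincide exactly, whatever prior in (0, 1) is used.
    assert posterior_term(prior, F_H, F_L) == likelihood_term(F_H, F_L)

print(posterior(Fr(1, 2), F_H, F_L))
```

Outcomes 0 and 1 have different probabilities but the same likelihood ratio, so they lead to the same posterior and hence the same reward. Outcome 2, whose ratio is one, leaves the prior unchanged.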

The fact that we can interpret the optimal sharing rule in standard
statistical terms is important. It is intuitively appealing and it will
yield some interesting predictions. At the same time it will reveal the
main weakness of the model: as we will see, very few restrictions can be
placed on the shape of the sharing rule.

To begin with, consider the issue of monotonicity, which one can say
something about. One might think that $s(x)$ should always be increasing in
X     given  that      f„      stochastically  dominates      f,  .      Somewhat   surprisingly  that 

is  not  true  in  generail-  The  reason  is  that  higher  output  need  not  always 
signal  higher  effort  despite  stochastic  dominance.  For  instance,  suppose 
f    (x)    =  f    (x+1 )      and     f^  (x)      is  not   unimodal    (say,    it   has    two  humps).      Then 

there  will    exist  two  values    of      x     such  that   the   higher   one  has    a  larger 
likelihood  ratio     f,  (x)/f    (x)     than  the  smaller   one,    implying  that  the 

larger   outcome  would  speak  more  strongly  for   a  low   choice   by  the   agent  than 
the  smaller   outcome.      Just   as  statistical    intuition  would  suggest,  we  should 
pay  the  agent  less   in  the.  high  outcome  state.     However,   to  the  extent  one 
thinks   this  is  not   descriptive  of   the  economic  situation  considered,   one  can 
add  the  assumption  that  the  likelihood  ratio  is  monotone  in     x.      Since,   from 
(1.^),   the  sharing  rule  is  monotone  in  the  likelihood  ratio,   this  assumption 
will   assure  a  monotone  sharing  rule.      Not  surprisingly,   the  Monotone 
Likelihood  Ratio  Property    (MLRP)    is    a  well    known   concept  from  statistics. 
It  was    introduced  into  economics    by  Milgrom    (1981),   who  suggested  the 
discussion  above. 
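
A small numerical sketch may help to see how stochastic dominance can coexist with a non-monotone likelihood ratio. The two-humped density below (an equal mixture of two normals) and the unit shift are hypothetical choices made for illustration; they are not taken from the text.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def f_high(x):
    """Hypothetical two-humped density under H: equal mixture of N(0, 0.5^2) and N(4, 0.5^2)."""
    return 0.5 * normal_pdf(x, 0.0, 0.5) + 0.5 * normal_pdf(x, 4.0, 0.5)


def f_low(x):
    """Density under L: the H density shifted one unit to the left, f_L(x) = f_H(x + 1)."""
    return f_high(x + 1.0)


def likelihood_ratio(x):
    """f_L(x)/f_H(x): large values speak for the low action."""
    return f_low(x) / f_high(x)


for x in (0.0, 2.0, 4.0):
    print(f"x={x:.1f}  f_L/f_H={likelihood_ratio(x):.3f}")
```

Because the H density is just the L density shifted to the right, H dominates L stochastically. Yet the outcome x = 2, which falls in the trough between the humps of the H density, carries a larger ratio f_L/f_H than either x = 0 or x = 4, so a rule that is monotone in the likelihood ratio would pay less at x = 2 than at x = 0.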

What    about   other   questions    concerning  the  shape  of     s(x)?      For   instance 
(anticipating  an  upcoming  discussion)   are  there  natural   restrictions   on  the 
model   that   yield  linear  sharing  rules?     The   answer   is  No.     The   problem  is 
that   the   connection  between      x     as    physical    output   and  as   statistical 
information  is   very  tenuous.      In  fact,   the   physical    properties    of      x     are 
rather   irrelevant   for  the  solution;    all   that  matters  is  the  distribution  of 
the   "posterior"    (or   likelihood  ratio)   as    a  function  of   the   agent's   action. 
To highlight the problem, note that x would not even have to be a cardinal
measure for its information content to be the same. Since it is the
information  content   of     x     that   determines   the  shape  of   the  optimal 
incentive  scheme,   it  is  hard  to  come  up  with  natural   economic  assumptions 
that   connect  the  agent's  reward  in  any  particular  way  to  the   physical 
measures  of     x. 

There are cases for which linear rules are optimal; in fact, almost
any shape of s(x) is consistent with optimality, because output can be
endowed with rather arbitrary information content. To illustrate this,
suppose we want an optimal rule that is linear between 0 and 100. Start
with any example with two actions, MLRP and a continuous outcome space. As
argued above, the optimal sharing rule will be monotone for such an example;
call it s*(x). Now transform the example by letting output be
$x' = \alpha s^*(x) + \beta$, where $\alpha$ and $\beta$ are constants to be
determined. Since this transformation is monotone, the information content
of x' is the same as the information content of x. It follows that the
optimal way of implementing H in this revised example is to pay the agent
$s(x') = \alpha^{-1} x' - \alpha^{-1} \beta$, which is a linear function of
the output x'. With s(x') the agent is paid s*(x) whenever x' corresponds
to x, since we know this is the cheapest way of implementing H. Or put in
statistical terms, this scheme pays the agent the same function of the
"posterior" as the optimal scheme in the initial example. The role of
$\alpha$ and $\beta$ is to assure that the range requirement can be met and
that H remains the optimal action to implement in the transformed example.
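
The relabeling argument can be sketched in a few lines. The rule `s_star` below is a hypothetical increasing schedule standing in for the optimal rule of some initial example; it is not derived from a model, and the sketch only handles the range requirement, not the requirement that H remain the optimal action.

```python
import math


def s_star(x):
    """Hypothetical increasing optimal rule from some initial example (not derived here)."""
    return 10.0 + 5.0 * math.sqrt(x)


outputs = [i / 10.0 for i in range(0, 41)]   # original outcomes x on a grid in [0, 4]
pay_star = [s_star(x) for x in outputs]

# Choose alpha and beta so that transformed output x' = alpha * s*(x) + beta spans [0, 100].
alpha = 100.0 / (max(pay_star) - min(pay_star))
beta = -alpha * min(pay_star)


def transform(x):
    """Relabel output: x' = alpha * s*(x) + beta."""
    return alpha * s_star(x) + beta


def linear_rule(x_prime):
    """Linear rule in transformed output: s(x') = x'/alpha - beta/alpha."""
    return x_prime / alpha - beta / alpha


x_primes = [transform(x) for x in outputs]
pay_linear = [linear_rule(xp) for xp in x_primes]

print(f"alpha={alpha:.3f}, beta={beta:.3f}")
print(f"x' ranges over [{min(x_primes):.1f}, {max(x_primes):.1f}]")
```

The linear rule pays exactly s*(x) at every relabeled outcome, which is the sense in which the transformed example has a linear optimal scheme.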

The same idea can be used to prove the optimality of other shapes as
well. Some very weak restrictions apply. For instance, as proved in
Grossman and Hart (1983), s(x) cannot be decreasing everywhere and on
average s(x) cannot be increasing too rapidly either. More generally, one
can show that s(x) has to satisfy $0 < \int s'(x) f_H(x)\,dx < 1$, but that is
about all. This inability to place natural restrictions on the model that
yield commonly observed sharing rules should be contrasted with the theory
of risk sharing in which linear schemes, for instance, arise from simple
restrictions on preferences alone.

While the model puts few constraints on the sharing rule, it yields
very sharp predictions about the measures that should enter the contract in
the first place. To illustrate this, suppose initially that $x = \pi$ and
next introduce some other source of information, y, that could potentially
be used in the contract. This could be information about the general
economic conditions under which the agent operates, it could be indirect
evidence from the performance of agents in stochastically related
technologies or it could be direct monitoring of his performance. When
would it be the case that y is valuable in the sense that a contract based
on the vector $x = (\pi, y)$ Pareto dominates all contracts based on $\pi$
alone?

The answer is evident from our earlier discussion and equation (1.4).
The additional signal y will necessarily enter an optimal contract if and
only if it affects the posterior assessment of what the agent did; or
perhaps more accurately, if and only if y influences the likelihood ratio.
Conversely, s(x) will not depend on y precisely when

(1.5)  $f_L(x)/f_H(x) = h(\pi)$, almost everywhere.

If (1.5) is true, y will be worthless, but if (1.5) is false y will have
some strictly positive value, because s(x) will depend on it. This
necessary and sufficient condition can be translated into a more familiar
form:

(1.5')  $f_i(x) = A(x) B_i(\pi)$, a.e., $i = L, H$.

In this form the condition says that $\pi$ is a sufficient statistic for
$x = (\pi, y)$. Thus, we have the simple but strong result that y is valuable if
and only if it contains some information about the agent's action that is
not already in $\pi$ (Holmstrom (1979, 1982a) and Shavell (1979)).
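
The condition can be illustrated with two stylized signal structures. In the sketch below all noise terms are standard normal and the two effort levels are hypothetical values; both assumptions are made only to keep the densities explicit. In case A the extra signal y is independent noise; in case B it is an observed common shock that also enters profit.

```python
import math

A_LOW, A_HIGH = 0.0, 1.0   # hypothetical effort levels L and H


def normal_pdf(z, sd=1.0):
    """Density of a mean-zero normal with standard deviation sd."""
    return math.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def ratio_pure_noise(pi, y):
    """Case A: pi = a + eps and y is independent noise; joint density is phi(pi - a) * phi(y)."""
    f_low = normal_pdf(pi - A_LOW) * normal_pdf(y)
    f_high = normal_pdf(pi - A_HIGH) * normal_pdf(y)
    return f_low / f_high


def ratio_common_shock(pi, y):
    """Case B: pi = a + y + eps, where y is an observed common shock."""
    f_low = normal_pdf(y) * normal_pdf(pi - y - A_LOW)
    f_high = normal_pdf(y) * normal_pdf(pi - y - A_HIGH)
    return f_low / f_high


for y in (-0.5, 2.0):
    print(f"y={y:+.1f}  case A ratio={ratio_pure_noise(1.0, y):.4f}  "
          f"case B ratio={ratio_common_shock(1.0, y):.4f}")
```

Holding profit fixed, the likelihood ratio in case A does not move with y, so y would be worthless in a contract. In case B the ratio varies with y, so y would enter an optimal contract.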

This sufficient statistic condition underlines again the close analogy
between the strategic principal-agent game and classical statistical
decision theory, which describes a game against nature. Blackwell's
celebrated result, which states that optimal single-person decision rules
can be based on sufficient statistics alone, is very similar. Some
differences should be noted, however. First, while (1.5') says that
randomization has no value (just as Blackwell's theorem), this conclusion
depends on the separable form of the agent's utility function as Gjesdal
(1982) has shown. (Of course, this randomization could be carried out
without an exogenous costly signal, so in this sense it still remains true
that y has no value if (1.5') holds.) Second, the fact that any signal
with some information about the agent's action has strictly positive value
has no counterpart in Blackwell's theorems.

An alternative way of expressing the sufficient statistic condition is
to say that it partially orders various information systems (see Grossman
and Hart (1983) and Gjesdal (1982)). If x and x' are two different
information signals (possibly vectors), which can be ordered by Blackwell's
notion of informativeness, say so that x is more informative than x', then
it is true that x' is not preferred to x. In fact, if the ordering is strict
then x is strictly preferred to x' in "almost all" agency problems. The
qualifier "almost all" is needed to take care of exceptional situations in
which x' is equal to the optimal sharing rule s(x) for a particular problem,
which of course is as much information as one would ever want from x. We
leave the qualifier deliberately vague to avoid getting too far away from
our main course.

The sufficient statistic result gives the model its main predictive
content,[2] as we will indicate shortly.

1.4     The General Case

Let   us    consider   briefly  what   happens   when  one  moves   beyond  the  two 
action  case  studied  above.      Economically,   not  much  new  will   come  out,   but   it 
is  worth  understanding  why. 

Consider the common case where the agent's action is a continuous,
one-dimensional effort variable. The agent's incentive constraint (1.3) is in
this case problematic and it has been standard to replace it by the more
manageable first-order condition:

(1.6)  $\int u(s(x)) f_a(x;a)\,dx - c'(a) = 0$.

Relaxing (1.3) in this way is referred to as the "first-order approach" in
the literature. It is easy to proceed to a characterization of the optimal
scheme, provided the relaxation embedded in (1.6) is appropriate. The
result is as follows:

(1.7)  $v'(x - s(x))/u'(s(x)) = \lambda + \mu f_a(x;a)/f(x;a)$, for a.e. x.

Here $f_a/f$ is the continuous counterpart of the likelihood ratio. It is
increasing when MLRP holds. Thus, when this characterization is correct, we
get the same qualitative insights as from the simple two-action case above,
including the sufficient statistics results.

Unfortunately, the "first-order approach" does not always work, in the
sense that it will sometimes pick out a scheme that in the end does not
satisfy the global incentive constraint (1.3) even though it does satisfy
the first-order condition (1.6). Mirrlees (1975) was the first to recognize
the dilemma. Subsequently, Grossman and Hart (1983) and Rogerson (1985b)
worked out conditions that ensure the validity of the first-order approach.
It is of some interest to understand the resolution, because the issue has
received considerable attention.

First,    consider   a  simple   extension  of   the  two-action  case.      Assume  the 
agent   controls   the  following  family  of   distributions: 

(1.8)  $f(x;a) = a f_H(x) + (1 - a) f_L(x)$, $a \in [0,1]$.

In  other  words,   the   agent   determines    by  his   effort   a  convex  combination  of 
two  fixed  distributions.      This  was   called  the  Spanning  Condition  by  Grossman 
and  Hart;    we  will   refer  to  it   as   the  Linear  Distribution  Function  Condition 
(LDFC).      Note  that   by  randomizing  in  the  two-action  model   the  agent  has 
access   to  the  family  described  by    (1.8). 

With  LDFC   it  is  evident  that   the   "first-order   approach"    is   valid.      The 
reason is that no matter what schedule the principal offers to the agent,
the  first-order   condition  will   coincide  with  the   agent's  global   incentive 
constraint    (for   a  fixed  action),   since  the  integral    in   (1.3)    is  linear   in 
a. 
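
The role of linearity can be seen in a small discrete sketch. The outcome probabilities, the utility values attached to the schedule, and the quadratic cost function below are all hypothetical; the convexity of the cost function is an added assumption that makes the agent's payoff concave, so that a stationary point is a global maximum.

```python
P_HIGH = [0.2, 0.3, 0.5]    # hypothetical outcome probabilities under f_H
P_LOW = [0.5, 0.3, 0.2]     # hypothetical outcome probabilities under f_L
UTILITY = [0.0, 1.0, 2.0]   # hypothetical u(s(x_k)) for some fixed schedule s


def expected_utility(a):
    """Agent's expected utility under f(x;a) = a*f_H + (1 - a)*f_L."""
    return sum((a * ph + (1.0 - a) * pl) * u
               for ph, pl, u in zip(P_HIGH, P_LOW, UTILITY))


def cost(a):
    """Hypothetical convex effort cost."""
    return a ** 2


def payoff(a):
    """Agent's payoff: expected utility minus effort cost."""
    return expected_utility(a) - cost(a)


# Expected utility is affine in a, so its slope is the same everywhere.
slope = expected_utility(1.0) - expected_utility(0.0)

# First-order condition: slope - c'(a) = 0, i.e. a = slope / 2 for c(a) = a^2.
a_foc = slope / 2.0

# Brute-force global maximum over a grid on [0, 1].
grid = [k / 1000.0 for k in range(1001)]
a_best = max(grid, key=payoff)

print(f"first-order solution a={a_foc:.3f}, grid maximum a={a_best:.3f}")
```

Whatever utility values the schedule generates, expected utility remains affine in a, so the action picked out by the first-order condition agrees with the global maximizer found by search.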

When we treat the general case using the "first-order approach" we are
effectively taking a linear approximation of the true family of
distributions f(x;a) around the particular action, a* say, that the
principal wants to implement; in other words we are treating the problem as
if the agent were choosing from the hypothetical family:

(1.9)  $\hat{f}(x;a) = f(x;a^*) + a f_a(x;a^*)$, a small,

using a cost function

$\hat{c}(a) = c(a^*) + a c'(a^*)$.

The family in (1.9) is linear in the same sense as LDFC and there is no
problem in getting a proper characterization. (Note that since $\int f_a = 0$,
$\hat{f}$ is a legitimate distribution for small a, provided $f_a$ is bounded.)
However, it may be that once we have gotten the agent to choose a = 0 (i.e.
choose the desired a*) among the distributions in (1.9), he would actually
want to go to another distribution in the true family {f(x;a)} that he is
controlling. This involves a discrete jump in the effort level and is the
source of the potential problem with the "first-order approach."

So the question is what distributions we can add to (1.9) and still be
assured that the agent would not want to deviate to any of them. Here is
the class proposed originally by Mirrlees and later verified by Rogerson
(1985b). Assume that {f(x;a)} satisfies MLRP and that it additionally
satisfies the Convexity of Distribution Function Condition (CDFC):

(1.10)  $F(x; \lambda a + (1 - \lambda) a') \leq \lambda F(x;a) + (1 - \lambda) F(x;a')$, $\forall\, a, a'$; $\lambda \in (0,1)$.

What (1.10) says is that the agent always has an action available that
yields a distribution which stochastically dominates the distribution he
could achieve by randomizing between the two actions a and a' (in other
words a peculiar sort of diminishing stochastic returns to scale); LDFC is
obviously a special case of (1.10).
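
Both conditions can be checked numerically for a given family. The family F(x;a) = x^a on 0 < x < 1 with a > 0 used below is an editorial illustration, not an example from the text; for this family the density is a x^(a-1), so that f_a/f = 1/a + ln x.

```python
import math


def cdf(x, a):
    """Illustrative family F(x; a) = x**a on 0 < x < 1, a > 0 (an editorial choice)."""
    return x ** a


def score(x, a):
    """f_a/f for this family: the density is a*x**(a-1), so f_a/f = 1/a + ln(x)."""
    return 1.0 / a + math.log(x)


xs = [k / 20.0 for k in range(1, 20)]     # interior outcomes
actions = [0.5, 1.0, 2.0, 3.0]            # hypothetical effort levels
weights = [0.1, 0.25, 0.5, 0.75, 0.9]     # mixing weights lambda

# CDFC: F(x; lam*a + (1-lam)*b) <= lam*F(x;a) + (1-lam)*F(x;b) on the grid.
cdfc_holds = all(
    cdf(x, lam * a + (1 - lam) * b) <= lam * cdf(x, a) + (1 - lam) * cdf(x, b) + 1e-12
    for x in xs for a in actions for b in actions for lam in weights
)

# MLRP: f_a/f is increasing in x for every action on the grid.
mlrp_holds = all(
    score(x_lo, a) < score(x_hi, a)
    for a in actions for x_lo, x_hi in zip(xs, xs[1:])
)

print(f"CDFC holds on grid: {cdfc_holds}; MLRP holds on grid: {mlrp_holds}")
```

A grid check of this kind cannot prove either property, but here both also follow analytically: x^a = exp(a ln x) is convex in a, and 1/a + ln x is increasing in x.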

Now, let us see why this restriction will do the job. The optimal
scheme that obtains with the local family of distributions (1.9) is
differentiable. From this follows, using integration by parts:

(1.11)  $\int u(s(x)) f(x;a)\,dx - c(a) = K - \int u'(s(x)) s'(x) F(x;a)\,dx - c(a)$,

where K is an integration constant. Because of MLRP, $s'(x) \geq 0$ and so
by CDFC, the right hand side is a concave function in a. Consequently,
none of the distributions in the original family will be as appealing to the
agent as the action the principal is implementing from the local family
(1.9). Hence, s(x) remains optimal in the extended family as well.
This argument is illustrated in the picture below.

The triangle represents the simplex of all distributions in the case
where there are only three possible outcomes $x_1$, $x_2$, and $x_3$, which we
assume for ease of diagramming. One axis measures $p_1$, the other $p_2$; the
third, $p_3 = 1 - p_1 - p_2$, does not appear in the picture. The curved line
CBD is the one-dimensional manifold of distributions f(x;a) (here
represented as $\{(p_1(a), p_2(a)) \mid a \in A\}$); this set is
one-dimensional, because the action, a, is a scalar. Any straight line in
the simplex represents a family satisfying LDFC. The shaded region is the
set P of all distributions that the agent has access to when randomized
strategies are included (cf. the general distribution formulation in 1.2).
The picture does not show the cost function and the incentive scheme. With
a third dimension measuring costs and rewards, the incentive scheme would
be a hyperplane and the cost function a convex manifold in $R^3$.

Assume the principal wants to implement the distribution at point B
(representing the earlier a*). Corresponding to the argument above, he
starts by designing a cost minimizing scheme that implements B when the
agent's hypothetical alternatives are the distributions along the tangent to
B (the tangent represents the distributions in the linear family (1.9)).
This cost minimizing scheme is characterized by (1.7). Next, CDFC and MLRP
assure (using (1.7) and (1.11)) that none of the distributions along the
curved line (or in P for that matter) is as attractive to the agent as point
B given the scheme in (1.7). Thus, B is indeed implemented in the actual
set of feasible distributions P. Without CDFC and MLRP, the agent might
want to jump across, for instance to C, when B is being implemented from the
tangent set. Then (1.7) would not be valid.

As might be expected, MLRP and CDFC are very restrictive conditions and
economically rather peculiar. Particularly, CDFC seems to rule out a number
of "natural" families, because few of those we might think of are closed
under convex combinations. For instance, there is no family we know of that
satisfies both conditions and is generated from the technology $x = a + \theta$
(or $x = a\theta$). This does not mean that the set of families that satisfy both
CDFC and MLRP is small. There is an easy way of generating sample families
with both properties. Simply start with any two distributions and extend
this family by LDFC as in (1.8). If the two initial distributions can be
ordered by MLRP, the extended family will have this property. Note that the
role of MLRP is here just to get the resulting schedule increasing and not
to assure the validity of the "first-order approach" which is already
guaranteed by LDFC.[3]

The  fact   that   LDFC  appears    to  be   the  main  instrument  for   constructing 
families   with  CDFC  and  MLRP   leaves   open  the   question  whether  there  are  any 
interesting  cases   that   do  not   satisfy  LDFC  but  merely  CDFC.      Except   for 
added  convenience   in  studying  examples,   this   issue   is   not   terribly 
interesting  either.     We  already  saw  how  the   two-action  case  was   rather  rich 
in  generating  a  variety  of  optimal    incentive  schemes.      This  richness 
obviously  carries   over   to  the   LDFC  case. 

From the preceding discussion one should infer that the first-order
approach works in the case where the family of distributions that the agent
controls is one-dimensional in distribution space (LDFC). It also works in
cases which are effectively one-dimensional in the sense that their solution
is equivalent to a problem with a one-dimensional family (CDFC plus MLRP).
Notice that it is one-dimensionality in distribution space that makes things
simpler, not one-dimensionality in the underlying economic variable
(effort). Even though effort is being taken to be one-dimensional, the
curve it traces will in general, when convexified, generate a higher
dimensional P, making matters complex.

What is meant above by "the first-order approach works" also needs a
bit of elaboration. Its precise meaning is that the optimal scheme is
characterized by (1.7), which is a narrower statement than that one can
describe the agent's choice by first-order conditions. Looking at things in
distributional terms, we note that the agent in the picture above has two
decision variables: $p_1$ and $p_2$. If the cost function over P were
strictly convex and the optimal distribution to implement were interior to
P (say, because the cost goes to infinity towards the boundary), then a
first-order approach in the traditional sense would work perfectly well.
Normally a single first-order condition would not be enough to describe the
agent's behavior, but two would always do. One would then get a
characterization like (1.7), but with two multipliers $\mu_1$ and $\mu_2$ rather
than one. This dilutes the information content of the characterization; the
sufficient statistic results will not be as crisp (in particular, optimal
incentive schemes may aggregate more than what the earlier sufficient
statistic result indicated; see section 1.5) and statements about
monotonicity will be hard to make. Needless to say, when one goes to higher
dimensional cases, the value of a general characterization along these lines
quickly disappears.[4a, 4b]

We conclude that models with a continuous effort variable allow a
simple characterization when they look much like the two action case
discussed before. In that case the solution, as far as the optimal reward
structure is concerned, exhibits the same features and the same variety.
One difference is worth stressing, though. In the two action model it is
difficult to say anything about the agent's choice of action, because it is
not determined by a continuous trade-off. One has to compare the solution
that implements H with the solution that implements L directly. On the
other hand, if effort is a continuous variable and the "first-order
approach" works, then it can be proved (Holmstrom, 1979) that the optimal
level of effort to implement is such that the principal would like to see it
go even higher. In other words, in equilibrium we should see principals
desiring more effort from their workers. Since this enrichment can be had
already by moving from the two action case to the LDFC case, there appears
to be little reason ever to go beyond LDFC in a model that wants to exploit
the characterization in (1.7).

1.5     An  Intermediate  Assessment 


The  main  predictive   content   of   the   basic  agency  model    is   in  the 
sufficient   statistic  result,  which  tells  what   information  should  enter   into 
a  contract   in  the  first   place.      Simple  as   it  seems,   this   result   turns   out   to 
have   quite  a  bit  of  economic  scope.      One  trivial   implication  is   that   agency 
relationships    create   a  demand  for  monitoring.      This   has    generated 
substantial    interest  in  the   accounting  literature  and  led  to  various 
refinements in predicting the usefulness of different monitoring schemes
(for a survey, see Baiman (1982)).

A  more  significant   implication  concerns    the   use  of   relative   performance 
evaluation (Baiman and Demski (1980), Holmstrom (1982a)). Agents who work
on  tasks   that   are  related  in  the  sense  that   one  task  provides   information 
about   the  other,   should  not  be   compensated  solely  on  individual   output,   but 
partly  on  the  output   of   others.      Note  that  the  reason  for  this    (according  to 
the  sufficient  statistic  result)   is   not  that   one  would  like  to  induce 
competition  for   incentive   purposes,   since   if   the   agents'    technologies   are 
not   stochastically  related,   relative   performance  evaluation  is  useless   at 
best.      Rather,   competition  is  a  consequence  of   the   desire  to  extract 
information  about  the   circumstances   under  which  the   agents   performed.     This 
information  is   used  to  filter   out   as  much  of   the  exogenous   uncertainty  as 
possible  so  as   to  allow  more  weight   on  individual   performance. 

A  further   consequence  of   the  sufficient  statistic  result  is  that 
sometimes aggregate information will do as well as detailed information in
relative   performance  schemes.      For   instance,    if    technologies   have  normal 
noise,    then  weighted  averages   of   peer   performance  will   suffice   as   a  basis 
for   an  optimal    scheme.      The  weights    are  proportional    to  the  information 
content   of   the  signals   from   peers. 
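
A minimal sketch of such a weighted average follows. It assumes that each peer's output, net of expected effort, equals a common shock plus independent normal noise with known variance, and that nothing else is known about the shock; under those assumptions the precision-weighted mean is the maximum likelihood estimate of the shock. The numbers and the function names are hypothetical.

```python
def precision_weights(variances):
    """Weights proportional to precision (1 / variance), normalized to sum to one."""
    precisions = [1.0 / v for v in variances]
    total = sum(precisions)
    return [p / total for p in precisions]


def common_shock_estimate(peer_outputs, variances):
    """Precision-weighted average of peer outputs as an estimate of the common shock."""
    weights = precision_weights(variances)
    return sum(w * x for w, x in zip(weights, peer_outputs))


peer_outputs = [1.2, 0.8, 1.5]     # hypothetical peer outputs, net of expected effort
peer_variances = [0.5, 1.0, 4.0]   # hypothetical idiosyncratic noise variances

weights = precision_weights(peer_variances)
theta_hat = common_shock_estimate(peer_outputs, peer_variances)

own_output = 2.0                   # hypothetical output of the agent being evaluated
filtered_output = own_output - theta_hat

print("weights:", [round(w, 3) for w in weights])
print(f"estimated common shock={theta_hat:.3f}, filtered own output={filtered_output:.3f}")
```

Peers with less noisy technologies receive more weight, and the single number `theta_hat` is all that the evaluation of the agent needs from the peer group in this stylized setting.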

Predictions   like  these  accord  at  least  broadly  with  stylized  facts. 
Relative   performance  evaluations   are   commonplace,    particularly  in  the  form 
of   prizes    (for   instance   promotions)   awarded  to  top  performers   in  an 
organization.      Indeed,   the  labor  market   as   a  whole  forms  a  grand  incentive 
structure  in  which  relative  evaluations   implicitly  or   explicitly  play  a 
dominant   role.      The  literature  on  rank  order  tournaments,   initiated  by 
Lazear   and  Rosen    (1981),    has   studied  in  more   detail   the   performance   and 
design of such contests (see also Green and Stokey (1983) and Nalebuff and
Stiglitz (1983)). We note that the use of rank order as a basis for payment
is rarely optimal in the basic agency model; one could usually do better
with schemes sensitive to cardinal measures. However, there may be other
advantages to rank order payments not captured by the standard agency model.
One reason is that rank is easier to measure in many circumstances. Another
argument, suggested by Carmichael (1984), Malcomson (1984) and Bhattacharya
(1983), is that tournaments provide the principal with incentives to honor
promised awards even in cases where legal enforcement is difficult, because
performance can be observed but not verified. In tournaments the total
amount paid by the principal remains constant and payment should therefore
be easy to verify.

Explicit relative performance schemes have recently emerged in
executive compensation packages as well. Typically, they relate managerial
performance to companies within the industry, which fits the notion that
stochastically closer technologies have more value as a basis for optimal
rewards.      Antle  and  Smith   (1985)    have  studied  more  broadly  the   degree  of 
relative   performance   evaluation  in   executive    compensation,   measuring 
implicit   (as   well   as   explicit)   contractual   elements.      Their  statistical 
tests  show  that  the   data  in  fact   exhibit   a  component   of  relative 
compensation,   but  not   to  the  extent   predicted  by  the  basic  theory.      This 
seems   puzzling  at   first,   but  two  explanations    can   be  suggested  for  the 
evidence.      First,   executives  may  be   diversifying  their  portfolio  through 
personal   transactions   in  the  market,   which  do  not   show   up  in  the   data;    in 
fact  the  next  section  will   discuss   a  model   with  precisely  the  property  that 
no  relative   performance   payments    are  necessary,    because  the  executive   can 
manufacture them himself. The other, more plausible reason is that
relative performance evaluations distort economic values and thereby
decision-making (e.g. an executive completely insulated from systematic
risk will not care about it in evaluating investment decisions). In the
one-dimensional agency models normally studied, such decisions are excluded.
Including  more  decision  dimensions   in  the  model   seems   essential   for  gaining 
a  better  fit  with  the   data  and  a  better   understanding  of   the  merits   of 
relative  performance  schemes. 

Given  that  the   basic  agency  model    is   so  general,    it   is   perhaps 
surprising  that   it  has   any  predictive   value   at   all.     To  this   can  be   added 
the   value  of   having  a  paradigm  within  which  one  can  start  to  consider  in 
more  precise  terms  such  subjects   as   the  managerial   theory  of   the  firm. 
Jensen and Meckling's (1976) pioneering work is an example of what insights
one  might   be   able  to  derive  from   the  mere  recognition  that  managers  need  to 
be  provided  with  incentives   against  shirking;    another  more  explicit  model  on 
the  same  subject   is   in  Grossman   and  Hart    (1982).      Both  papers    derive  the 
capital   structure  of   the  firm  from   the   underlying  incentive   problems    (with 
opposite  hypotheses    about   the  manager's   options   to  dilute  the  firm's 
resources). While these studies beg the question why capital structure
needs    to  be   used  for   incentive   purposes   when   direct   incentive  schemes   would 
appear   cheaper,   they  still   open  the   door  for  further  investigations   into  a 
subject   that  surely  is   of   substantial    economic   importance. 

Let us next turn to the problems with the basic agency model. The main
one is its sensitivity to distributional assumptions. It manifests itself
in an optimal sharing rule that is complex, responding to the slightest
changes   in  the   information  content   of   the  outcome     x.      Such  "fine-tuning" 
appears   unrealistic.      In  the  real   world   incentive  schemes   do  show  variety, 
but not to the degree predicted by the basic theory. Linear or piece-wise
linear schemes, for instance, are used frequently and across a large range
of environments.⁴ Their popularity is hardly explained by shared properties
of the information technology as the basic model would have it. It is clear
that other technological or organizational features, excluded from the
simple model, must be responsible for whatever regularities in shapes we
do observe empirically.

Fine-tuned,    complex  incentive  schemes   also  stand  in  the  way  of   serious 
extensions    and  applications.      One   can  say  little   about   comparative  statics 
properties   of   the  model   and  it  is  also  hard  to  introduce   additional 
variables   into  the  analysis.     This   is  a  critical   drawback,   because  the 
unobservable  variable  in  the  model    (say  effort)   is  not  of  primary  interest 
precisely  because  it  cannot  be  observed.      Instead  one  would  be  interested  to 
know  what   consequences   the  agency  model  has  for  such  observable  variables   as 
investment   decisions   and  task  assignments,   for   instance.      Little  has   been 
done  in  this  regard,   because  of   the   complexity  of   the   basic  solution.      (For 
one  attempt   that  reveals   these   difficulties,   see  Lambert    (1986).) 

Thus, casual empiricism, as well as the desire to include decision
variables   of   allocational   and  aggregate  significance,   strongly   point   to  a 
need  to  refine  agency  models  in  the   direction  of   predicting  simpler 
incentive  schemes.     We  turn  next   to  such  an  effort. 

1.6      Robustness   and  Linear   Sharing  Rules 

The prevalence of relatively simple incentive schemes could partly be
explained by the costs of writing intricate contracts.⁵ But that is hardly
the whole story. A more fundamental reason is that incentive schemes need
to perform well across a wider range of circumstances than specified in
standard agency models. In other words, incentive schemes need to be
robust.

One  way  of    expressing  the    demand  for  robustness    is   to  allow  the   agent   a 
richer  set   of   actions   or  strategies.      Intuitively,   the  more  options   the 
agent   has,   the  poorer  intricate  schemes   will   perform.     To  give  a  familiar 
example:      if   there  is   a  secondary  market   for   goods,   arbitrage  will   take   away 
all opportunities for price-discrimination. Linear schemes are optimal,
because they are the only ones that are operational.

Another   elementary  example  of   how  added  options   contribute  to 
simplifications   can  be   given  in  the   context   of   our   basic  agency  model.     We 
noted  that   an  optimal    incentive  scheme  need  not   be  monotone  in  general 
unless  MLRP  holds.      On  the  other  hand,   if   the   agent   is  allowed  free  disposal 
of   output,   then  the  only  operational   schemes   are  monotone  no  matter  what  the 
stochastic  technology  looks   like.      This  illustrates   the   kind  of  non- 
distributional   considerations   that   one  is  led  to  look  for  in  understanding 
more  universal   properties   of   incentive  schemes. 
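
The free-disposal argument can be illustrated with a small sketch. It assumes, purely for illustration, that output takes finitely many values listed in increasing order and that a scheme is a list of payments, one per output level. If the agent can destroy output before it is observed, then at any realized output he collects the highest payment attached to that level or any lower one, so the payments actually received are monotone whatever scheme is written down. The function name and the numbers are hypothetical.

```python
def free_disposal_envelope(scheme):
    """Effective payment at each output level when the agent may
    destroy output before it is observed (outputs in increasing order)."""
    effective = []
    best_so_far = float("-inf")
    for pay in scheme:
        # The agent can always reach any lower output level by disposal,
        # so he receives the best payment available at or below this level.
        best_so_far = max(best_so_far, pay)
        effective.append(best_so_far)
    return effective


# Hypothetical non-monotone scheme over five increasing output levels.
scheme = [1.0, 3.0, 2.0, 5.0, 4.0]
effective = free_disposal_envelope(scheme)
print(effective)
```

Any dip in the written scheme is simply undone by the agent, which is why only monotone schemes remain operational in this sense.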

Recently, Holmstrom and Milgrom (1985) have proposed a simple agency
model in which linear schemes are optimal because the agent is assumed to
have a rather rich action space. The main idea can best be grasped by
describing an example, due to Mirrlees (1974), in which no optimal solution
exists. Mirrlees' example has a risk neutral principal, an agent with
unbounded marginal utility for consumption and a technology with output
$x = a + \theta$, where $\theta$ is a normally distributed error term with
zero mean and $a$ is the agent's labor supply. In other words, the agent
controls the mean of a normally distributed output. This technology is the
most obvious candidate for an agency analysis and it is quite a shock to
learn that the problem has no solution. The reason is that first-best can be
approximated arbitrarily closely by step-function schemes that offer
first-best risk sharing (a flat reward) for almost all outcomes except the
extremely bad ones, for which a severe punishment is applied. This
approximation result is in fact easy to understand using the statistical
intuition that the basic model offers. The normal technology has a
likelihood ratio $f_a/f$ that is unbounded below (it is linear in $x$).
Therefore, very low $x$-values will be very informative about the agent's
action and one can act on that information almost as if it revealed
compliance perfectly. The step-functions approximate forcing contracts,
which are well known to be optimal if there are outcomes that reveal
deviations with certainty.⁷
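
To see why the likelihood ratio is linear in output, one can write the normal density out explicitly. The sketch below assumes the error term has variance $\sigma^2$; that symbol is introduced here for illustration and is not named in the text at this point.

```latex
f(x \mid a) = \frac{1}{\sqrt{2\pi\sigma^2}}
              \exp\!\left(-\frac{(x-a)^2}{2\sigma^2}\right),
\qquad
\frac{f_a(x \mid a)}{f(x \mid a)}
  = \frac{\partial}{\partial a}\,\ln f(x \mid a)
  = \frac{x-a}{\sigma^2}.
```

The ratio is linear in $x$ and tends to $-\infty$ as $x \to -\infty$, which is the sense in which extremely low outcomes become arbitrarily informative about a shortfall in effort.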

The  example  is   clearly  unrealistic  and  there  are  ways   to  patch  it   (e.g. 
bound  utility  or  bound  the  likelihood  ratio).      But  this  would  be  misleading, 
because  the  example  points   to  a  more  fundamental   flaw.      Step-functions   come 
close  to  first-best  only  under  the  unrealistic  assumption  that  one  knows 
exactly  the   parameters   of   the  problem    (utility  functions,  technology,   etc.) 
and  they  will   generally   perform  lousily  as   soon  as   one  introduces   slight 
variations   or   uncertainty  into  the  model.      In  other  words,   the   example 
represents   the   extreme   case  of   fine-tuning  we  talked  about    earlier. 

For   instance,    think  of   a   dynamic   context,   where  the   agent   is   paid  at 
the  end  of   the  week  say,   and  assume  he   can  observe   his   own  performance 
during  the  week  so  that   he   can  adjust  his  labor   input   as   a  function  of   the 
realized path of output. Then step-functions will induce a path of effort,
which will be both erratic and on average low (generally, the agent will
bide his time to see if there is any need at all to work). In contrast, a
linear   scheme,   which   applies    the  same  incentive   pressure  no  matter   what   the 
outcome history is, will lead to a more uniform choice of effort. This
suggests that the optimality of step-functions hinges on the
assumption that the agent chooses his labor input only once.⁸

This intuition can be made precise by considering a dynamic version of
the normal example. Specifically, let the agent control the drift rate $\mu$
of a one-dimensional Brownian motion $\{x(t);\ t \in [0,1]\}$ over the unit
time interval. Formally, the process $x(t)$ is defined as the solution to the
stochastic differential equation:

(1.12)  $dx(t) = \mu(t)\,dt + \sigma\,dB(t), \quad t \in [0,1]$.

Here $B$ is standard Brownian motion (zero drift and unitary variance). Note
that the instantaneous variance, $\sigma^2 dt$, is assumed constant.
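
As an illustration of the process just defined, the following sketch simulates the end-of-period position when the drift is held constant. It writes `mu` for the drift rate and `sigma` for the volatility coefficient, and uses a simple Euler discretisation of the unit interval. The parameter values, the number of steps and the function name are illustrative assumptions, not part of the original model.

```python
import math
import random
import statistics


def simulate_final_position(mu, sigma, steps, rng):
    """Euler discretisation of dx = mu dt + sigma dB on [0, 1], with x(0) = 0."""
    dt = 1.0 / steps
    x = 0.0
    for _ in range(steps):
        # Each increment has mean mu*dt and variance sigma^2 * dt.
        x += mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x


rng = random.Random(0)          # fixed seed so the run is reproducible
mu, sigma = 0.5, 2.0            # hypothetical drift and volatility
finals = [simulate_final_position(mu, sigma, 50, rng) for _ in range(20000)]

sample_mean = statistics.fmean(finals)
sample_var = statistics.variance(finals)
print(sample_mean, sample_var)  # should be close to mu and sigma**2
```

With a constant drift the simulated final position should have a sample mean near `mu` and a sample variance near `sigma**2`, so the agent is effectively choosing the mean of a normal distribution, as in the static example.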

The   agent   in  the  model   is   assumed  to  have  an  exponential   utility 
function  and  the   cost  of   effort   is,   unlike  in  our  earlier  model,   assumed  to 
be   independent   of   the  agent's  income.      In  other  words,  the   agent's  payoff 
is: 

(1.13)  $u\left(s(x) - \int_0^1 c(\mu(t))\,dt\right) = -\exp\left\{-r\left(s(x) - \int_0^1 c(\mu(t))\,dt\right)\right\}$

as evaluated at the end of the horizon, where $x = x(1)$ is the final
position of the process (the profit level at time 1, say), $c(\mu)$ is a
convex (instantaneous) cost function and $r$ is the coefficient of absolute
risk  aversion.      The   particular   form   of    the   utility  function  assures   that   a 
linear   scheme  will    indeed  apply  the  same  incentive   pressure  over   time.      In 
general    income  effects   would  cause  distortions. 

Notice that if the agent were unable to observe the path $x(t)$, then it
would be optimal for him to choose a constant drift rate $\mu(t) = \mu$
(because $c(\cdot)$ is convex) and the end-of-period position $x$ would be
normally distributed with mean $\mu$ and variance $\sigma^2$. In other words,
we would have a model identical to the earlier discussed one-period example
that has no optimal solution, because step-functions approximate first-best.
When the agent can observe $x(t)$ and base his choice $\mu(t)$ on the history
of the path of $x(t)$ (which we will denote $x^t$), the situation is
significantly changed. Instead of being constrained to a one-parameter family
of outcome distributions, the rich set of contingent strategies,
$\{\mu(x^t);\ t \in [0,1]\}$, permits a vastly wider choice. The enormous
expansion of the agent's opportunity set limits the principal's options
dramatically; in fact, for each strategy that the principal wants to
implement there is essentially a unique incentive scheme that he must use,
which stands in sharp contrast to the usual flexibility in choice that the
principal has in one-dimensional static models.

The  one-to-one  mapping  between  strategies   and  sharing  rules  makes   the 
model solvable technically (recall the discussion in section 1.4). The
relationship  can  be  written  out   explicitly  and  after  that  it  is  easy  to  show 
that  the  optimal   rule  is  linear.     The  interested  reader  is  referred  to  the 
original   paper  for   details. 

Intuitively  the  result  can  be  seen  as  follows.     Consider   a  discrete 
version  of   the   Brownian  model,   one  in  which  the   agent   controls   a  Bernoulli 
process.      Because  of    exponential    utility  it   is   easy  to  see  that   the  optimal 
compensation  scheme,    if    it   could   be  made    contingent   on  the  whole  path  of 
periodic  outcomes,   would  be  to  pay  the   agent   the  same  bonus   each  time  he  has 
a  "success";    the   problem   is  stationary,   because  there  are  no  income  effects. 
Viewed  as    an   end-of-period  payment  scheme,   this   rule  pays   the   agent   a 
constant   plus    the  number   of   successes   times   the   bonus,   which  amounts    to  a 
linear  scheme  in  end-of-period  profits.      The   Brownian  model,   being  the  limit 
of    a  Bernoulli   process,   should  therefore  be   expected  to  have   a  linear 
optimum as well and it does indeed.

Notice that this line of reasoning shows that the principal need not
use the detailed information of the path of the outcome process even if he
has access to it. This is a case in which an insufficient statistic with
respect to the agent's distributional choice (the end-of-period level of
profits) is still enough for constructing an optimal rule; in other words, a
case in which the principal uses more aggregated information than the
sufficient statistic results of one-dimensional models would suggest. The
reason is that there is no conflict of interest in the timing of effort,
only in the aggregate level of effort; hence information about timing is of
no value.⁹
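
The Bernoulli argument can be checked mechanically. The sketch below assumes a hypothetical base payment and per-success bonus and compares a payment rule that tracks the whole path of periodic outcomes with a rule that only looks at the end-of-period number of successes. It illustrates why the path carries no extra information for payment purposes; it does not derive the optimality of the constant bonus.

```python
import random


def path_contingent_pay(path, base, bonus):
    """Pay a fixed base plus the same bonus for every success, period by period."""
    pay = base
    for outcome in path:        # outcome is 1 (success) or 0 (failure)
        if outcome == 1:
            pay += bonus
    return pay


def linear_pay(total_successes, base, bonus):
    """Linear scheme that uses only the end-of-period aggregate."""
    return base + bonus * total_successes


rng = random.Random(1)
base, bonus = 10.0, 2.5         # hypothetical contract parameters
paths = [[rng.randint(0, 1) for _ in range(20)] for _ in range(1000)]

same = all(
    abs(path_contingent_pay(p, base, bonus) - linear_pay(sum(p), base, bonus)) < 1e-9
    for p in paths
)
print(same)
```

Two paths with the same number of successes but different timing receive identical pay, which is the discrete counterpart of the statement that information about timing is of no value.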

The  remarkable  thing  about  this  model   is  that   by  making  the  incentive 
problem   apparently  much  more  complicated   (the  rigorous   proof  that   a  linear 
scheme  is  optimal    is  non-trivial),   it  delivers   in  the  end  a  much  simpler 
solution.      In  fact,  once  we  know  that  the  optimal   incentive  scheme  is  linear 
it  is   trivial   to  solve  for   its   coefficients.      A  linear  scheme  will   induce 
the   agent   to  choose  a  constant   level   of    effort.      Therefore  we  can  treat   the 
problem as a static one (cf. the discussion above) in which the agent
chooses the mean of a normal distribution, but this time with the constraint
that the principal is only allowed to use linear rules. The dynamics
rationalizes an "ad hoc" restriction to linearity in the static model and in
the process resolves the non-existence problem that Mirrlees originally
posed! 

Computational    ease   gives   the  model   substantial   methodological    value. 
In   contrast   to  general    agency  models   it  is   easy   to  conduct    comparative 
statics   exercises.     More  importantly,   one  can  use  the  model   as   a  building 
block   in  studying  richer   applications   of   moral    hazard.      Such  applications 
are further facilitated by the fact that the linearity results extend to
situations in which the agent controls the vector of drift rates of a
multi-dimensional Brownian process; or in static terms, chooses the mean
vector of a multivariate normal distribution.

As a brief illustration, let us discuss the effects of agency costs on
investment decisions, assuming that investments are made jointly by the
principal and the agent. (We cannot let the agent make the choice privately,
because that would amount to having him control the variance, which would
upset the linearity results.) Suppose there is a collection of projects
available for investment. Each project returns $x = \mu + \theta$, where
$\theta$ is a normally distributed variable with mean $m$ and variance
$\sigma^2$ and $\mu$ is the agent's effort. For a closed form solution,
assume the cost of effort is quadratic: $c(\mu) = \mu^2/2$. To make the
example a bit richer, assume in addition that there is a market index $z$,
normally distributed with variance $\gamma^2$ and zero mean, which correlates
with $x$. Then each project can be characterized by the triple
$(m, \sigma^2, \rho)$, where $\rho$ is the correlation coefficient between
$z$ and $x$.

To determine the best investment one solves first for the optimal
incentive scheme and net return to the principal, given a particular
project. The optimal scheme is linear in $x$ and $z$, i.e. of the form
$s(x,z) = \alpha_1 x + \alpha_2 z + \beta$. The best coefficients are easy to
calculate. One finds that the principal should set

(1.14)  $\alpha_1 = (1 + r\sigma^2(1 - \rho^2))^{-1}$,

(1.15)  $\alpha_2 = -\alpha_1 (\sigma/\gamma)\rho$.

The constant coefficient $\beta$ is determined by the agent's participation
constraint. If he has to be assured a zero certain equivalent, then the
principal will be left with an expected net return equal to

(1.16)  $\pi = m + (1/2)(1 + r\sigma^2(1 - \rho^2))^{-1}$.
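
A sketch of how expressions (1.14) to (1.16) can be obtained is given below. It writes $\alpha_1$ and $\alpha_2$ for the coefficients on $x$ and $z$, $\beta$ for the constant, $\mu$ for effort, $\sigma^2$ and $\gamma^2$ for the variances of $x$ and $z$, and $\rho$ for their correlation. It relies on the standard fact that, with exponential utility and normal noise, the agent's certainty equivalent is expected pay minus effort cost minus $r/2$ times the variance of pay. The steps are a reconstruction for the reader's convenience and are not taken from the original paper.

```latex
\begin{align*}
\text{Agent's certainty equivalent:}\quad
  & CE = \alpha_1(\mu + m) + \beta - \tfrac{1}{2}\mu^2
        - \tfrac{r}{2}\,\mathrm{Var}(\alpha_1 x + \alpha_2 z), \\
\text{Effort choice:}\quad
  & \mu = \alpha_1, \\
\text{Variance of pay:}\quad
  & \mathrm{Var}(\alpha_1 x + \alpha_2 z)
    = \alpha_1^2\sigma^2 + 2\alpha_1\alpha_2\rho\sigma\gamma
      + \alpha_2^2\gamma^2, \\
\text{Minimizing over } \alpha_2:\quad
  & \alpha_2 = -\alpha_1(\sigma/\gamma)\rho,
    \qquad \mathrm{Var} = \alpha_1^2\sigma^2(1-\rho^2), \\
\text{Net return with } CE = 0:\quad
  & \pi = \alpha_1 + m - \tfrac{1}{2}\alpha_1^2
        - \tfrac{r}{2}\,\alpha_1^2\sigma^2(1-\rho^2), \\
\text{Maximizing over } \alpha_1:\quad
  & \alpha_1 = \bigl(1 + r\sigma^2(1-\rho^2)\bigr)^{-1}, \\
\text{Substituting back:}\quad
  & \pi = m + \tfrac{1}{2}\bigl(1 + r\sigma^2(1-\rho^2)\bigr)^{-1}.
\end{align*}
```

The coefficient on $z$ is chosen only to minimize the variance of pay, since $z$ has zero mean and is unaffected by effort.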

Note that the optimal incentive scheme exhibits relative performance
evaluation. The agent is not merely rewarded based on the project outcome
$x$, but also on the market outcome $z$. (The sign of $\alpha_2$ is the
opposite of $\rho$, as one would expect.) This is in accordance with the
general result that an optimal design should filter out as much
uncontrollable risk as possible. Using $z$ as a filter reduces uncontrollable
risk by the factor $(1 - \rho^2)$. If $x$ and $z$ happen to be perfectly
correlated, all risk can be filtered out and first-best can be achieved. (In
first-best $\alpha_1 = 1$ and $\pi = m + 1/2$.)¹⁰

The best project is the one that maximizes (1.16). Because of the
agency problem, we see that project choice depends on the degree of
idiosyncratic risk as measured by $\sigma^2(1 - \rho^2)$ (which is the
conditional variance of $x$ given $z$). The price of that risk is a function
of the agent's risk aversion (and in general also the cost of effort). There
is no price for systematic risk, because the principal is risk neutral. One
could allow a risk averse principal (with exponential utility) without
altering the linearity result and then systematic risk would also enter the
decision criterion. But the main point is that, unlike in standard portfolio
theory, idiosyncratic risk will play a role in investment decisions.
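
The project-selection rule can be sketched numerically. The code below evaluates the slope on the project outcome, the coefficient on the market index and the principal's net return from expressions (1.14) to (1.16), using `sigma2` and `gamma2` for the variances of the project outcome and the market index and `rho` for their correlation. The three projects and the risk-aversion value are hypothetical and chosen only to show that the project with the highest mean need not be chosen.

```python
def alpha1(r, sigma2, rho):
    """Slope on the project outcome x, as in (1.14)."""
    return 1.0 / (1.0 + r * sigma2 * (1.0 - rho ** 2))


def alpha2(r, sigma2, rho, gamma2):
    """Coefficient on the market index z, as in (1.15)."""
    return -alpha1(r, sigma2, rho) * (sigma2 / gamma2) ** 0.5 * rho


def net_return(m, r, sigma2, rho):
    """Principal's expected net return, as in (1.16)."""
    return m + 0.5 * alpha1(r, sigma2, rho)


# Hypothetical projects: name -> (mean m, variance sigma^2, correlation rho).
projects = {
    "A": (1.00, 4.0, 0.0),   # highest mean, all risk idiosyncratic
    "B": (0.90, 4.0, 0.9),   # lower mean, mostly market risk
    "C": (0.95, 1.0, 0.0),   # lower mean, little risk
}


def rank(r):
    """Project names ordered from best to worst for risk aversion r."""
    return sorted(projects, key=lambda n: net_return(projects[n][0], r,
                                                     projects[n][1],
                                                     projects[n][2]),
                  reverse=True)


ranking_risk_averse = rank(2.0)   # agent with absolute risk aversion 2
ranking_risk_neutral = rank(0.0)  # no agency cost of risk
print(ranking_risk_averse, ranking_risk_neutral)
```

With a risk-neutral agent the ranking follows the means alone; with a risk-averse agent, projects with less idiosyncratic risk move up because they permit stronger incentives.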

Because idiosyncratic risk carries a price, diversification will
generally have value (see Aron (1984) for the same point). Also, a concern
for idiosyncratic risk will give rise to a market portfolio that is more
concentrated than under full information. Firms will find value in choosing
projects that are more heavily correlated with the market, because that will
enable a better incentive design. (This assumes all projects are positively
correlated.) Thus, agency costs could amplify aggregate swings in the
economy.

This  discussion  is  merely  suggestive  of  what  one  might  be  able  to  do 
when  linear  schemes  are  optimal.   It  appears  that  linearity  has  the 
potential  to  take  us  towards  some  livelier  and  more  serious  economic 
analyses.   (For  some  other  illustrative  examples,  see  the  original  paper.) 
On  the  other  hand,  the  Brownian  model  is  quite  special.   The  technological 
options  are  very  limited;  for  instance,  the  fact  that  the  agent  cannot  be 
allowed  to  make  private  investment  decisions  is  an  unfortunate  constraint 
for  applications.   The  effectiveness  of  the  Brownian  model  is  restricted, 
because it does not capture the demand for robustness in the most intuitive
way.   Presumably,  one  will  have  to  go  outside  the  Bayesian  framework  and 
introduce  bounded  rationality  in  order  to  capture  the  true  sense  in  which 
incentive  schemes  need  to  be  robust  in  the  real  world. 

1.7  Dynamic  Extensions 

Dynamic  extensions  of  the  basic  agency  model  are  of  interest  for  two 
rather  opposite  reasons.   One  has  to  do  with  the  relevance  of  the  incentive 
issues  portrayed  in  the  static  models,  the  other  with  the  added  predictions 
that  might  be  had  from  introducing  dynamics.   In  the  former  category  we  have 
theoretical  studies  that  suggest  that  time  may  resolve  agency  problems 
costlessly.   This  has  been  argued  both  from  the  perspective  of  supergames, 
in  which  all  cooperative  gains  can  be  realized  between  two  parties,  and  in 
terms of reputation effects created by the market. While we do not concur
in either case with the conclusion that incentive problems disappear, it is
worth understanding the arguments. They will take us to dynamic models that
can expand and sharpen the predictions from the static theory.

The  first  studies    of   dynamic  agency  were  those  by  Radner    (1981)    and 
Rubinstein    (1979).      Both  show  that   in  an   infinitely  repeated  version  of   the 
basic  one-period  model,   the  first-best  solution   (complete  risk-sharing 
together  with   correct   incentives)    can   be   attained  if   utilities   are  not 
discounted.      The   analysis   does   not   offer   an  optimal   solution,   but   rather   a 
class of contracts within which first-best can be reached. These contracts
operate  like   control   charts,   punishing  the   agent   for  a  period  of   time  if  his 
aggregate  performance  falls   sufficiently   below   expectations.      Over   time,   as 
uncertainty  is  filtered  out   by  the  law  of  large  numbers,   the  punishments   get 
more  severe  and  the   control  region  tighter.      The   assumption  of  no 
discounting  assures   that  only  events   in  the   distant  future,   where  the 
control   is   tight   and  few  violations   occur,   matter. 

These  models  appear  to  formalize  the   intuition  that   in  long-term 
relationships   one  can   cope  more  effectively  with  incentive   problems,   because 
time  permits   sharper   inferences   about   true   performance.      The  fact   that 
first-best   can   be   achieved  is  more  incidental   and  a   consequence  of   the 
unrealistic  assumption  of   no   discounting  paired  with  infinite  repetition. 
Even  though   Radner    (1981)    has   subsequently  shown  that   with  some   discounting 
one can still get close to first-best, there is little reason to believe
that   incentives   are  costless   in  reality.      The  main  question  then   is  whether 
dynamics    alters    the   insights    and  results   from   one-period  models.      In  the 
studies    above,    as   well   as   in  subsequent   work  by  Rogerson    (1985a)   and  Lambert 
(1983) (see also Roberts (1982) and Townsend (1982)), memory plays a key
role, suggesting that an optimal long-term contract might look rather
different from a sequence of short-term contracts.

Jumping to such a conclusion is premature, however. The models
discussed above assume that the agent cannot borrow and save, in which case
long-term  contracts   substitute  in  part  for  self-insurance  that  would  in  fact 
be  available  to  agents    (saving  is   certainly  a  real   option  and  limited 
borrowing  as  well).     Could  it  be  that  the   gains   to  long-term   contracting 
identified  in  the  early  models   are  in  fact   due  to  restrictions   on  borrowing 
and  savings? 

Recent studies by Allen (1985), Malcomson and Spinnewyn (1985) and
Fudenberg et al. (1985) show that this may indeed be the case. More
specifically, if one goes to the other extreme and assumes that the agent
can  access   capital   markets  freely  and  on  the  same  interest  terms  as   the 
principal,   then  long-term  contracts  will   be  no  better  than  a  sequence  of 
short-term  contracts   in  the    (independently)  repeated  model. 

For   instance,    Allen  noted  that   if   there  is  no   discounting,   then  one   can 
simply  appeal   to  Yaari's    (1972)   early  work  on  consumption  under   uncertainty 
to  conclude  that   a  first-best  solution  can  be   achieved  by  having  the   agent 
rent  the  production  technology  from  the  principal   at   a  fixed  price.     The 
agent,  by  borrowing  and  saving,   need  not  be   concerned  about  fluctuations   in 
income,   since  they  can   be  smoothed  out   at   no   cost.      In  this   case  self- 
insurance  is  perfect   and  risk  carries   no  premium. 

Allen also studies the finite horizon case, but in a pure insurance
context (specifically Townsend's (1982) model), which is simpler than the
agency model we have been discussing. Also here he finds that long-term
contracts do not dominate short-term contracts. The same results for the
agency model are established by Malcomson and Spinnewyn and Fudenberg et al.
These two papers differ in that the former assumes that the agent's
borrowing and saving decisions can be verified (and hence his consumption
can be controlled contractually), while the latter treats these decisions as
private to the agent. The basic idea of the argument is very similar,
however. The key observation is that long-term contracts can be duplicated
by a sequence of short-term contracts by rearranging the payment stream to
the agent without altering its net present value along any realized path.
Roughly  speaking,   the  rearrangement   works   so  that  the  principal   clears  his 
balance  with  the   agent   in  utility  terms  each   period.      Since  there  is  a 
capital   market,   the  timing  of   payments   does   not  matter.      The  agent   gets   back 
to  the  consumption  stream  implied  by  the  long-term   contract  by  borrowing  and 
saving  appropriately. 
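
The role of the capital market in this argument can be illustrated with simple arithmetic. The sketch below is not the construction used in the cited papers, which works in utility terms; it only shows, for one hypothetical realized path, that two payment streams with the same present value support the same consumption stream once the agent can borrow and save at the interest rate used for discounting. All numbers and function names are illustrative assumptions.

```python
def present_value(stream, interest):
    """Present value at date 0 of payments made at dates 0, 1, 2, ..."""
    return sum(pay / (1.0 + interest) ** t for t, pay in enumerate(stream))


def savings_path(payments, consumption, interest):
    """Balance carried into each following period when the agent receives
    `payments`, consumes `consumption` and saves or borrows the difference."""
    balance = 0.0
    balances = []
    for pay, cons in zip(payments, consumption):
        balance = (balance + pay - cons) * (1.0 + interest)
        balances.append(balance)
    return balances


interest = 0.05
# Consumption stream implied by a hypothetical long-term contract.
long_term = [10.0, 10.0, 10.0]
# Rearranged payments: front-loaded, last payment chosen to keep the same PV.
last_payment = 10.0 - 5.0 * (1.0 + interest) ** 2
rearranged = [15.0, 10.0, last_payment]

balances = savings_path(rearranged, long_term, interest)
print(present_value(long_term, interest), present_value(rearranged, interest))
print(balances)
```

The agent saves the early surplus and draws it down later, ending with a zero balance, so the timing of payments along the path is immaterial to him.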

Of   course,   the   assumption  that  the  agent   can  borrow  and  save  freely  in 
the capital market is rather unrealistic. (In addition, the Fudenberg et al.
model   assumes   that  the  agent   can   consume  negative   amounts,   which  certainly 
is   unrealistic.)      Nevertheless,   the  models    do  make   clear   that   one  should  not 
rush  to  the    conclusion  that   long-term   contracts,   at   least   in  repeated 
settings,   have  substantial    benefits;    in  some  situations,   the   insights   of    the 
one-period  models   remain  unaltered  with  the   introduction  of   dynamics.      More 
importantly  though,   these  findings  suggest   that  since  we  do  observe  long- 
term  relationships   and  long-term   contracts,   some  other  forces   than  income 
smoothing  are  likely  to  be   behind  the   benefits. 

There   are  many  potential   reasons   one   could  think  of.      Informational 
linkages between periods are discussed in Fudenberg et al. and some other
reasons   will    be  taken   up  in  Part   III.      Here  we  want   to  stress   that   when 
contingencies    are   hard  or   impossible  to  verify  so  that    explicit   contracts 
cannot be easily enforced, long-term relationships are likely to provide
major advantages. They can implicitly (via reputation effects) support
contracts which may be infeasible to duplicate in short-term relationships.
Bull    (1983)   offers   a  model   of   this   variety,   which  we  will   come  back  to  in 
Part  III.      Radner's   and  Rubinstein's  models   are  also  best  interpreted  in 
this fashion; both have self-enforcing equilibria which do not require
outside enforcement. Lazear's (1978) model on mandatory retirement is in
the same vein. Lazear argues that age-earnings profiles slope upwards (as
an abundance of empirical evidence corroborates; see, however, Abraham and
Farber (1986) for contradicting evidence), because that way incentives for
work are maintained over the agent's employment horizon. The implication is
that termination of employment should be mandatory, because marginal product
will   be  below  pay  at  later  stages   in  the   career.     While  the  argument  needs 
some  refinement,    Lazear's  model   serves  well  as   an  illustration  of  how 
introducing  dynamics   can  yield  additional  predictions   into  the   basic  agency 
set-up. 

As   a  related  example  of  reputation  modelling,   let   us   consider  Fama's 
(1980)    argument  that  incentive  problems,   particularly  managerial   incentive 
problems,   are  exaggerated  in  the   agency  literature,   because  in  reality  time 
will   help  alleviate  them.      His  reasoning  is   different  from  Radner's  and 
Rubinstein's  in  that  it  focuses   on  the  power   of   the  market  to  police 
managerial    behavior,   rather   than   on  the  theory  of   supergames.         Fama   coins 
the  term   "ex  post  settling  up"    for   the   automatic  mechanism   by   which  managers' 
market   values,    and  hence  their  incomes,   are  adjusted  over   time  in  response 
to  realized  performance.      If  there  is  little  or  no   discounting,   then  the 
manager  will   be  held  fully  responsible  for   his   deeds   through  his  life-time 
income  stream   and,    Fama   claims,    induced  to  perform   in  the  stockholders' 
interest.

Fama's intuitive argument has been formalized in Holmstrom (1982b). We
will  sketch  the  construction  partly  to  indicate  that  the  first-best  result 
hinges  on  very  special  assumptions,  but  also  because  the  model  offers  the 
simplest  illustration  of  reputation  formation  and  suggests  some  interesting 
extensions. 

Consider  a  risk-neutral  manager  who  operates  in  a  competitive  market 
for  managerial  labor.   Assume  the  market  can  follow  the  manager's 
performance  over  time  by  observing  his  periodic  output.   At  the  same  time, 
assume  that  the  manager's  fee  cannot  be  made  contingent  on  output,  because 
enforcing  third  parties  cannot  verify  the  output.   Therefore  the  manager 
will  be  paid  his  expected  marginal  product  in  each  period. 

Obviously,  if  the  world  only  lasted  for  one  period,  the  manager  would 
have  no  incentives  to  put  out  extra  effort.  But  if  he  wishes  to  stay  in  the 
profession  longer,  matters  are  different.   Prospective  employers  will  follow 
the  manager's  performance  and  forecast  his  future  potential  from  past 
behavior.   Logically,  this  means  that  there  must  be  some  characteristic  of 
the  manager  that  is  not  fully  known  to  the  market  and  which  is  being 
signalled  by  past  performance.   For  managers,  competence  or  talent  is  a 
natural  candidate  for  what  is  being  signalled,  though  many  other 
alternatives  could  also  be  considered. 

Let  us  now  see  how  the  uncertainty  about  the  manager's  competence  will 
induce  effort  even  though  there  is  no  explicit  contract. 

In the simplest setting the manager controls a linear technology:

(1.17)  $x_t = a_t + \eta_t + \varepsilon_t$,

where $x_t$ is output in period $t$, $a_t$ is the manager's effort, $\eta_t$
is a quantified measure of managerial competence and $\varepsilon_t$ is a
stochastic shock term with zero mean. Managerial competence progresses over
time according to a simple auto-regressive process:

$\eta_{t+1} = \eta_t + \delta_t$.

The role of this process will become evident shortly.

In each period the manager will be paid his expected marginal product.
This is the sum of his expected competence as assessed on the basis of past
performance and the value of his effort $a_t$. Since the market is assumed to
know the utility function of the manager, it can forecast the manager's
choice of $a_t$.

To find out what the manager will do in equilibrium and what he will be
paid, one has to solve for a rational expectations equilibrium. This is
relatively easy if the shock terms $\theta_t$ and $\epsilon_t$ are normal and the prior on
competence is also normal. Then the market will be monitoring a standard
normal learning process (see DeGroot (1970)) in which assessments about
competence are updated based on a weighted average of present beliefs and
the last observation of output. If we denote by $m_t$ the expected value of
$\eta_t$ based on history, then $m_t$ progresses as:

(1.18)  $m_{t+1} = \alpha_t m_t + (1 - \alpha_t)(x_t - a_t)$ .

Note  that  the  market  in  updating  beliefs  about  competence  will  subtract  from 
output  the  present  level  of  effort,  which  it  can  infer  in  equilibrium.  This 
filters  out  time-varying  transient  effects. 

The weights $\alpha_t$ are deterministic functions of time and converge to
some equilibrium value $\alpha \in (0,1)$ in the long run. The value of $\alpha$ depends
on the distribution of the stochastic shock terms. If competence stays
constant (i.e. $\mathrm{var}(\epsilon_t) = 0$), then $\alpha = 1$. In general $\alpha$ is smaller the
more noisy the competence process is relative to the noise in the output
process; i.e. the stronger the signal-to-noise ratio is.
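
The behavior of the updating weights can be illustrated numerically. The following is a minimal sketch, not taken from the original text, assuming the standard normal-normal updating that the passage describes. The function names, the variance parameters `var_output_noise` and `var_innovation`, and the numerical values are illustrative assumptions.

```python
import math

def updating_weights(prior_var, var_output_noise, var_innovation, periods):
    """Return the weights alpha_t placed on current beliefs m_t in (1.18).

    After observing x_t - a_t, the posterior mean is
    alpha_t * m_t + (1 - alpha_t) * (x_t - a_t), with
    alpha_t = var_output_noise / (v_t + var_output_noise), where v_t is
    the variance of beliefs about competence before the observation.
    """
    weights = []
    v = prior_var
    for _ in range(periods):
        alpha = var_output_noise / (v + var_output_noise)
        weights.append(alpha)
        # The observation shrinks the belief variance by the factor alpha;
        # the innovation in competence then adds fresh uncertainty.
        v = alpha * v + var_innovation
    return weights

def long_run_weight(var_output_noise, var_innovation):
    """Closed-form limit of alpha_t (fixed point of the variance recursion)."""
    q, r = var_innovation, var_output_noise
    v = (q + math.sqrt(q * q + 4.0 * q * r)) / 2.0
    return r / (v + r)

if __name__ == "__main__":
    w = updating_weights(1.0, 1.0, 0.25, 50)
    print("first weight:", w[0], "last weight:", w[-1])
    print("closed-form limit:", long_run_weight(1.0, 0.25))
```

Under these assumptions the weights settle quickly at a value strictly between zero and one when competence keeps changing, and drift towards one when the innovation variance is set to zero, in line with the statements above.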

Given  that  the  market  updates  beliefs  according  to  (1.18)  and  pays  the 
manager in proportion to $m_t$ each period t, it is easy to calculate the
return from managerial effort in period t. In a stationary state the
marginal  return  will  be  given  by 

(1.19)  $k = \beta(1 - \alpha)/(1 - \alpha\beta)$ ,

where $\beta$ is the manager's discount factor and $\alpha$ is the aforementioned
long-run value of the updating weight. From this we can see that if $\beta$ is
close to 1, then marginal returns to effort will be close to one both in
the manager's objective function and the production technology, so
incentives will be right. In general, though, effort will fall short of
first-best. It will be lower the lower the discount factor is and the lower
the ratio between the variances of $\epsilon_t$ and $\theta_t$ is, i.e. the more there is
noise  in  the  output  process  and  the  less  there  is  innovation  in  the 
competence  process.   This  is  all  in  line  with  intuition.   If  output  is  very 
noisy,  returns  from  effort  will  be  distributed  further  into  the  future  and 
have  less  value.   On  the  other  hand,  variation  in  competence  will  raise  the 
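need to reestablish one's reputation and therefore increase effort. Without
innovations in competence (the auto-regressive process above), the updating
weight in (1.18) would tend to one and the manager's effort would converge
to zero deterministically.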
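
The formula in (1.19) can be checked by adding up, period by period, the discounted effect of one extra unit of output on future pay. The sketch below is illustrative only. The function names are hypothetical, and the last function assumes a risk-neutral manager with a quadratic effort cost g(a) = a^2/2, which is not specified in the text.

```python
def marginal_return(beta, alpha):
    """Stationary marginal return to effort, k in (1.19)."""
    return beta * (1.0 - alpha) / (1.0 - alpha * beta)

def marginal_return_by_summation(beta, alpha, horizon=10_000):
    """Same quantity built up period by period.

    One extra unit of output today moves next period's assessment by
    (1 - alpha); that effect then decays by the factor alpha each period,
    and pay received s periods ahead is discounted by beta**s.
    """
    total = 0.0
    for s in range(1, horizon + 1):
        total += (beta ** s) * (alpha ** (s - 1)) * (1.0 - alpha)
    return total

def stationary_effort(beta, alpha):
    """Effort under an ASSUMED quadratic effort cost g(a) = a**2 / 2.

    The manager equates marginal cost g'(a) = a to the marginal return k,
    so effort equals k; first-best effort under this cost would be 1.
    """
    return marginal_return(beta, alpha)

if __name__ == "__main__":
    for beta, alpha in [(0.9, 0.6), (0.99, 0.6), (0.9, 0.9)]:
        print(beta, alpha, marginal_return(beta, alpha))
```

With these assumptions, effort approaches the first-best level only as the discount factor approaches one, and it falls as the long-run weight on past beliefs rises.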

As in the case of Radner's and Rubinstein's models, the result that
first-best  can  sometimes  be  achieved  is  of  little  interest  per  se.   It 
requires  very  special  and  implausible  assumptions,  in  particular  that  the 
manager  is  risk-neutral  and  does  not  discount  future  payoffs.   The  main 
point  with  the  model  is  rather  to  illustrate  that  reputation  can  indeed 
enforce  an  implicit  contract  of  some  form  when  learning  about 
characteristics  is  a  key  factor  as  it  often  would  seem  to  be.   In  the 
particular example the implicit contract performs exactly like an explicit
contract  would  (in  a  world  with  known  competence)  if  that  contract  were  of 
the  form  s(x)  =  kx  +  b,  where   k   is  given  in  (1.19).   It  is  important  to 
note,  however,  that  when  relying  on  reputation  effects,  at  least  as 
determined  in  the  market,  there  is  little  freedom  to  design  the  contract  in 
desirable  ways. 

Wolfson (1985) has conducted an empirical study of the returns to
reputation  in  the  market  for  general  partners  of  oil-drilling  ventures.   The 
results  conform  broadly  with  the  implications  of  the  example.   In  the  market 
for  oil-drilling  ventures  myopic  behavior  would  dictate  that  general 
partners  complete  fewer  wells  than  limited  partners  desire  (because  of  the 
tax  code).   However,  since  new  ventures  come  up  frequently  and  new 
partnerships  are  formed,  one  might  expect  general  partners  to  take  into 
account  their  reputation  and  complete  more  wells  than  would  be  optimal  in 
the  short  run.   Indeed,  Wolfson  finds  statistically  significant  evidence  for 
that  to  be  the  case.   The  market  prices  reputation  much  like  in  the  model 
described.   The  results  correspond  to  a  case  where  k  <  1,  because  Wolfson 
also  finds  that  residual  incentive  problems  remain  and  that  these  are 
reflected  in  the  price  of  the  shares  of  limited  partners. 

These  empirical  findings  give  reason  to  explore  further  the  workings  of 
reputation  and  learning.   The  general  idea  can  be  pursued  in  many  directions 
and  some  interesting  work  has  already  been  done.   Gibbons  (1985)  has 
considered  what  organizations  can  do  to  align  reputation  incentives  more 
closely  with  true  productivity.   It  is  evident  from  the  model  described  that 
there  need  not  be  a  very  close  relationship,  particularly  in  the  early 
periods,  between  the  returns  to  reputation  for  a  manager  and  his  present 
marginal product. Indeed, if we think of young managers in lower positions,
their returns from effort may vastly exceed the actual product of what they
do,   because  the  future  value  of   being  considered  competent  multiplies   in 
general   through  enhanced  responsibility.      One  way  of   coping  with  the 
problem,   suggested  by  Gibbons,   is   to  control   the  flow  of   information  about 
performance   potential   so  that  the  initial   impact   of   performance  is 
diminished. Perhaps the phenomenon of young professionals joining larger
partnerships before establishing their own firms can be seen as a way of
protecting oneself against overly strong reactions by the market if mistakes
happen early in the career.

Another   paper  that   elaborates   on  this  simple  learning  model   is  Aron 
(1984).      She  uses   the  learning  effects   to  derive   a  number  of   implications 
concerning  the   correlation  between  the  growth  rate  of  firms,   the   degree  of 
diversification  within  firms  and  the  size  of  firms. 

While  the   example  supported  the   common  intuition  that  incentive 
problems  are  alleviated  by  long-term   considerations,   it  is  important  to 
stress   that   this   is   by  no  means   true   universally.      In  fact,   career   concerns 
can  themselves    be   a  source  of    incentive   problems.      For    instance  in  Holmstrom 
and  Ricart-Costa    (198^)    a  model    is   analyzed   in  which   incongruities    in  risk- 
taking  between  managers    and  shareholders    arise  purely  because  of   reputation 
effects.      The  reason  is   that  managers   look  upon  investments   as   experiments 
that   reveal   information  about  their  competence,   while  shareholders   of   course 
view them in terms of financial returns. The main point is that there is no
reason  for   a  project's   human   capital   return  to  be    closely  aligned  with  its 
financial   return,   hence  the   problem  requires    explicit   incentive   alignment. 
For  those  who   distrust  incentive  models   that   rely  on  effort   aversion,  such  a 
model provides a new channel for analyzing managerial risk-taking
incentives.

Finally, we want to mention the work of Murphy (1984) as an example of
how  dynamics   can  help  discriminate  between  competing  theories   of 
compensation.      Murphy  compares    two  hypotheses   for  why  age-earnings    profiles 
tend  to  be  upward  sloping.      One  is  the  earlier  mentioned  model   by  Lazear. 
The  other  theory  suggests   that  the   upward  slope   comes  from  learning  about 
productivity  and  the  contracting  process   associated  with  insurance  against 
that   risk    (see   e.g.    Harris   and  Holmstrom    (1982)).      Murphy  argues   that   if   the 
incentive hypothesis were true then the variance in individual earnings
should  increase  with  tenure  because  of   income  smoothing.      The  reverse  should 
be  true  if   the  learning  hypothesis  held,   because  then  the  effects   of 
performance   information  are  strongest  in  the  early  years.     Murphy  tests 
these  competing  positions  on  panel   data  for  executive  compensation  drawn 
from  prospectuses.      His  results   are  rather  inconclusive,   perhaps   because 
both  effects   are  really  present.      But  the  main  point  is  that  in  principle 
dynamic models allow discrimination that is plainly unavailable from
single-period studies.

1.8      Summary  and  Conclusions 

Despite  the  length  of   this  section  we  have   covered  only  a  few 
dimensions of the extensive literature on principal-agent models. Before
summing  up  we  want  to  mention  two  important   omissions.      One  is  the  lack  of 
examples   of  Hidden  Information  Models,  which  have  played  a  visible  role  in 
the  literature,   often   under   the  name  of  Mechanism  Design    (see  Harris  and 
Townsend   (1981)   and  Myerson   (1979)    for  seminal   contributions   and  Green 
(1985)    for   a  unified  look  at  the  field).     We  will   partly  make  up  for   this 
omission in the next section where a model of hidden information is analyzed
in connection with labor contracting. The mechanism design approach has
been quite successful in explaining a range of institutions that are beyond
the scope of standard theory and it has also offered insights into normative
problems such as taxation (Mirrlees (1971)), auction design (Harris and
Raviv (1981), Myerson (1981) and Maskin and Riley (1984)) and regulation
(Baron  and  Myerson    (1982),    Baron  and  Besanko    (1984)    and  Laffont   and  Tirole 
(1985)).      In  the  same  way  as   the  models   we  have   discussed  here,   these  models 
are   plagued  by   an   excessive  sensitivity  to  informational   assumptions,   which 
makes   it  hard  to  go   beyond  qualitative   conclusions. 

The other major omission is that we have not discussed at all general
equilibrium   effects  from   contracting,   an  area  in  which  Stiglitz  has   been 
particularly  active.      As  Stiglitz  has   noted  in  a  variety  of   different 
contexts    (see  e.g.    Arnott   and  Stiglitz   (1985)),   the  imperfections   of  second- 
best  contracts   will   have  external   effects   that  may  be  important.     The 
general   idea,   familiar  from  second-best  theory,   can  be  described  as  follows. 
In  all   economies   contracting  between  two  parties   will   have  some  equilibrium 
effect   on  the  rest   of   the   economy.      However,    in  the   idealized  Arrow-Debreu 
world  equilibrium   occurs    at   a  social    optimum   and  so  the   impact   of   marginal 
changes   in  a  bilateral    contract   will   have   zero  social    costs.      In   contrast, 
when  we  are  in  a  second-best  world    (for  whatever  reason),   marginal    changes 
in contracts will have a first-order effect on the social welfare function,
which is not accounted for by the contracting parties.¹³ Perhaps one
relevant example would be the consequences of nominal contracts in one part
of the economy on the use of indexed contracts in other parts.

Naturally, such externalities could give reason for government
intervention. However, one should be careful in making sure that there is
an improving policy that acts solely on information that the government has
available. As a modeller it is easy to spot improvements, because the
modeller sees all the relevant information. But that does not imply
automatically that the government can improve things, particularly if the
more stringent notions of efficiency that are associated with incomplete
information models are applied. Operational welfare schemes in this sense
seem  to  have   been  little  explored  in  the  literature  to  date. 

Now  to  the   Summary: 

(1)  In  reduced  form  all   agency  models  have  the   agent   choose  from  a 
family  of   distributions   over   observable  variables,  such  as   output.      A  key 
simplification  in  Hidden  Action  Models  is   to  assume  that  the   agent  controls 
a one-dimensional family of distributions. This leads to a simple and
intuitive   characterization  of   an  optimal   scheme.      One-dimensionality  does 
not  refer  here  to  any  economic  variable,   like  effort,   but  to  the  set   of 
distributions   that  the   agent   can  choose  from.     Understanding  this   is 
important  for  resolving  the   confusions   associated  with  the   validity  of   the 
characterization  of   the  optimal   rule,   sometimes    (but  misleadingly)   referred 
to  as    the   validity  of   the  first-order   approach. 

(2)  The  main  insight   from   the   basic  Hidden  Action  Model    is   that   the 
optimal    incentive  scheme  looks   like  one  based  on  an   inference   about   the 
agent's   action  from  observable  signals.      This  implies   that   the  optimal 
scheme  is  highly  sensitive  to  the   information  content   of   the  technology  that 
the  agent   controls,  which  has   only  loose  ties   with  the   physical   properties 
of   that   technology.      Consequently,   fiddling  with  the   information  technology 
will    accommodate  almost   any  form   of    incentive  schedule   and  the   theory  is 
really  without   predictive   content   in  this   regard. 

What does have some predictive content, however, is the result that a
contract  should  use  all   relevant   information  that   is   available  up  to  a 
sufficient  statistic.      Among  other  things,   it  leads   to  statements   about  the 
use  of   relative   performance   evaluation,   which  seem   to  match  empirical 
evidence  at  least  broadly. 

(3)      However,   the  extreme  sensitivity  to  informational   variables   that 
comes   across   from   this   type  of  modelling  is   at   odds   with  reality.      Real 
world  schemes  seem   to  be  simpler   than  the  theory  would  dictate  and 
surprisingly  uniform  across   a  wide  range  of   circumstances    (e.g.   linear 
schemes   are  quite  common  in  a  variety  of   situations).      The   conclusion  is 
that something other than informational issues drives whatever regularities
one might observe. One possibility that has recently been suggested is that
the usual agency models are overly simplistic and fail to account for the
need  to  have  schemes   that  perform  well   in  a  variety  of   circumstances,   i.e. 
schemes   that   are  robust.     We  gave  one  example  of   a  model    in  which  robustness 
issues   lead  to  linear  schemes.      It  seems   that  research   in  this   direction 
could   have   high   payoffs   in  the   future. 

Another  reason  why  schemes    in  reality  are  simpler    and  less   sensitive  to 
environmental    differences   is   that    exotic   contracts   are  hard  to  evaluate  both 
in  terms  of   their   implied  performance   and  their   value  for  the   parties 
involved.      This   is  not  something  which  we  addressed  because  it  seems  to  fall 
outside   the   common  Bayesian  paradigm,   but   that   is  not   to  say   it  is 
unimportant.      Research  along  these  lines  may  also  have  high  payoffs. 

(4) The common Hidden Action Models are rather weak predictively. One
reason  is   that    complex  incentive  schemes  make   it   hard  to  say  anything  about 
distributional    choices.      The  other  reason  is   that   the   actions    in  the  model 
are not observable economic variables. (In this regard Hidden Information
Models are more useful, because actions are usually observable, e.g. levels
of investment or employment; see the next part.) Modelling efforts should
be directed more towards including interesting economic quantities that
focus on allocational consequences of agency. Robustness arguments that
predict simpler schemes should be helpful in this endeavor as we indicated
in section 1.5.

(5)      Another   useful   direction  for  sharpening  predictions   from   agency 
models  is  to  go  to  dynamic  formulations.      These  bring  to  bear  time  series 
and  panel   data  that   allow  discriminations   that  are  impossible  to  make  in 
static  models.     Dynamic  models   also  bring  attention  to  reputation  effects 
and  long-term   explicit   and  implicit   contracting  that  may  well   be   at  the 
center   of  real  world  incentive  problems. 

Part II: Labor Contracts

One  of  the  first  applications  of  contract  theory  was  to  the  case  of 
contracts  between  firms  and  workers  (the  seminal  papers  are  by  Azariadis 
(1975), Baily (1974) and Gordon (1974)). Part II is concerned with this work
and  various  extensions,  including  the  introduction  of  asymmetric  information 
and  macroeconomic  applications.   We  begin  with  the  Azariadis-Baily-Gordon 
model  itself  (for  an  excellent  recent  survey  of  labor  contract  theory  with  a 
rather  different  focus  from  the  present  one,  see  Rosen  (1985)). 

II.1 The Azariadis-Baily-Gordon (ABG) Model

The ABG model was developed to explain non-Walrasian employment
decisions,  particularly  layoffs,  and  to  understand  deviations  between  wages 
and  the  marginal  product  of  labor.   It  is  based  on  the  idea  that  a  firm  offers 
its  risk-averse  workers  wage  and  employment  insurance  via  a  long-term 
contract. 

The  model  can  be  described  as  follows.   Imagine  a  single  firm  that  has  a 
long-term  relationship  with  a  group  of  workers.    Presumably  a  lock-in  effect 
of  some  sort  explains  why  the  relationship  should  be  long-term,  although  this 
is  not  modelled  explicitly.   To  simplify,  assume  that  the  relationship  lasts 
two  periods.   At  date  0,  the  firm  and  workers  sign  a  contract  while  employment 
and  production  occur  at  date  1.    ABG  stress  the  idea  of  an  implicit  contract; 
we postpone discussion of this until Part III.4 and rely on the contract being
explicit  and  legally  binding. 

Let the firm's date 1 revenue be f(s,L), where s represents an exogenous
demand or supply shock, and L is total employment at date 1. Assume that the
date 0 workforce consists of m identical workers, where m is given.² Each
worker has an (indirect) von Neumann-Morgenstern utility function $U(I,\ell;p)$,
where I represents income or wages received from the firm, $\ell$ is employment in
the firm, and p refers to a vector of consumption goods prices. We shall
suppose that p is constant and therefore suppress it in what follows.³ We
assume that $U_I > 0$, $U_\ell < 0$ and U is concave in I and $\ell$ with $U_{II} < 0$, i.e.
workers are risk averse. The firm in contrast is supposed to be risk neutral.
We shall assume that $\ell$ is a continuous variable, in contrast to ABG who
suppose that it equals 0 or 1.

In the ABG model, the state s is taken to be publicly observable at
date 1, although unknown to both parties at date 0. In this case, a contract
can be contingent in the sense of making I and $\ell$ functions of s: $I = I(s)$,
$\ell = \ell(s)$. Since $\ell$ is smooth and U is concave in $\ell$, it is desirable to have
work sharing at date 1; i.e. $\ell(s) = L(s)/m$ (so this version of ABG does not
explain layoffs; see, however, II.4B). Therefore an optimal date 0 contract
solves:

(2.1)  $\max \; E_s\,[f(s, m\ell(s)) - mI(s)]$
       s.t.  $E_s\,[U(I(s), \ell(s))] \ge \bar{U}$,


where  both  expectations  are  taken  with  respect  to  the  objective  probability 
distribution  of  s,  which  is  assumed  to  be  common  knowledge  at  date  0.   We  are 
adopting  the  assumption  that  the  firm  gets  all  the  surplus  from  the  contract 
while  the  workers  are  held  down  to  their  date  0  reservation  expected  utility 
levels $\bar{U}$. Nothing that follows depends on this ex-ante division of the
surplus,  however. 

The  solution  to  (2.1)  is  very  simple.   Under  the  usual  interiority 
assumptions,  it  is  characterized  by 

(2.2)  $\frac{\partial f}{\partial L}(s, m\ell(s)) = -\left(\frac{\partial U}{\partial \ell}(I(s), \ell(s)) \Big/ \frac{\partial U}{\partial I}(I(s), \ell(s))\right)$  for all s,

(2.3)  $\frac{\partial U}{\partial I}(I(s), \ell(s)) = \lambda$  for all s,

(2.4)  $E_s\,U(I(s), \ell(s)) = \bar{U}$,

where $\lambda$ is a Lagrange multiplier. (2.2) tells us that the marginal rate of
substitution between consumption and employment equals the marginal rate of
transformation in each state. (2.3) tells us that a worker's marginal utility
of income is constant across states. It is the condition for optimal
insurance between a risk averse agent and a risk neutral agent. (Note that
(2.3) implies that if $\ell(s_1) = \ell(s_2)$, then $I(s_1) = I(s_2)$, i.e. wages vary only
if employment does.)
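
For completeness, conditions (2.2)-(2.4) can be recovered from the Lagrangian of (2.1). The following is a sketch under the interiority assumptions mentioned above; the multiplier $\mu$ on the participation constraint is notation introduced here for illustration and does not appear in the original.

```latex
\begin{align*}
\mathcal{L} &= E_s\bigl[f(s, m\ell(s)) - mI(s)\bigr]
  + \mu\Bigl(E_s\bigl[U(I(s),\ell(s))\bigr] - \bar{U}\Bigr), \\
\frac{\partial \mathcal{L}}{\partial I(s)} = 0
  &\;\Longrightarrow\; \frac{\partial U}{\partial I}(I(s),\ell(s)) = \frac{m}{\mu} \equiv \lambda, \\
\frac{\partial \mathcal{L}}{\partial \ell(s)} = 0
  &\;\Longrightarrow\; m\,\frac{\partial f}{\partial L}(s, m\ell(s))
  = -\mu\,\frac{\partial U}{\partial \ell}(I(s),\ell(s)).
\end{align*}
```

The first condition is (2.3). Dividing the second condition by the first gives (2.2), and (2.4) is the participation constraint holding with equality, which it must when $\mu > 0$.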

Several  observations  can  be  made.   First,  it  follows  from  (2.2)  that 
employment decisions will be ex-post Pareto-efficient in each state. Hence to
emphasize  what  is  by  now  well  known,  the  ABG  model  does  not  explain 
inefficient  employment  levels.   Although  there  was  some  initial  confusion 
about  this  result,  it  is  not  exactly  surprising  given  that  an  ex-ante  optimal 
contract  should  exploit  all  the  gains  from  trade  ex-post  (under  symmetric 
information).   Employment  levels,  however,  although  efficient,  are  not 
generally  the  same  as  in  a  standard  Walrasian  spot  market  where  the  wage  w(s) 
in  state  s  satisfies 


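(2.5)  $\frac{\partial f}{\partial L}(s, m\ell(s)) = w(s) = -\frac{\partial U}{\partial \ell}(w(s)\ell(s), \ell(s)) \Big/ \frac{\partial U}{\partial I}(w(s)\ell(s), \ell(s))$.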


The point is that the possibility of income transfers across states permits a
divergence between $I(s)/\ell(s)$ and $w(s) = \frac{\partial f}{\partial L}(s, m\ell(s))$. In fact, if labor is
a normal good and the Walrasian labor supply curve is upward sloping, Rosen
(1985) has pointed out that employment will generally vary more in a
contractual setting than in a Walrasian spot market.⁴
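
A small numerical sketch may help to fix ideas about how the contract in (2.2)-(2.4) differs from the spot market in (2.5). Everything specific in it is an assumption made for illustration and is not in the original: a single worker (m = 1), revenue f(s, L) = sL, utility U(I, l) = 2*sqrt(I) - l^2/2, two equally likely states, and a reservation utility set equal to the worker's expected utility in the spot market.

```python
import math

# ASSUMED primitives (not from the text): one worker (m = 1), revenue
# f(s, L) = s * L, utility U(I, l) = 2*sqrt(I) - l**2 / 2, two equally
# likely states.
STATES = [0.5, 1.5]
PROBS = [0.5, 0.5]

def expect(values):
    """Expectation over the two states."""
    return sum(p * v for p, v in zip(PROBS, values))

def spot_market():
    """Walrasian outcome (2.5): wage w(s) = s, worker picks hours given w."""
    # max over l of 2*sqrt(s*l) - l**2/2 gives l = s**(1/3).
    hours = [s ** (1.0 / 3.0) for s in STATES]
    incomes = [s * l for s, l in zip(STATES, hours)]
    utils = [2.0 * math.sqrt(i) - l * l / 2.0 for i, l in zip(incomes, hours)]
    return hours, incomes, expect(utils)

def optimal_contract(reservation_utility):
    """Solve (2.2)-(2.4) by bisection on the multiplier lam."""
    def expected_utility(lam):
        income = 1.0 / lam ** 2             # (2.3): U_I = I**-0.5 = lam
        hours = [lam * s for s in STATES]   # (2.2): s = l / lam
        return 2.0 * math.sqrt(income) - expect([l * l / 2.0 for l in hours])

    lo, hi = 1e-6, 1e6                      # expected utility falls in lam
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if expected_utility(mid) > reservation_utility:
            lo = mid
        else:
            hi = mid
    lam = math.sqrt(lo * hi)
    hours = [lam * s for s in STATES]
    income = 1.0 / lam ** 2
    profit = expect([s * l for s, l in zip(STATES, hours)]) - income
    return lam, hours, income, profit

if __name__ == "__main__":
    spot_hours, spot_incomes, u_bar = spot_market()
    lam, hours, income, profit = optimal_contract(u_bar)
    print("spot hours:", spot_hours, "spot incomes:", spot_incomes)
    print("contract hours:", hours, "income:", income, "profit:", profit)
```

In this assumed example the contract pays the same income in both states while hours move one-for-one with s, so pay per hour departs from the marginal product, and hours vary more across states than in the spot market.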

An important special case is where labor causes no disutility for a
worker per se, but simply deprives him of outside earning opportunities at
date 1. This can be represented by

(2.6)  $U(I,\ell) = U(I + R(\bar{\ell} - \ell))$ ,

where $\bar{\ell}$ is the worker's total endowment of labor and R is the wage in
alternative date 1 employment (in (2.6), labor is neither normal nor
inferior). (2.2) and (2.5) both then become

(2.7)  $\frac{\partial f}{\partial L}(s, m\ell(s)) = R$;

that is, employment levels will be exactly the same in a contract as in a
Walrasian spot market. (2.3), on the other hand, implies that

(2.8)  $I(s) + R(\bar{\ell} - \ell(s))$ = a constant.

That  is,  optimal  insurance  leads  to  the  equalization  of  a  worker's  (real) 
income  across  states  of  the  world  (relative  to  the  prices  p),  a  very  different 
outcome  from  what  one  would  see  in  a  spot  market. 
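
The two steps behind (2.7) and (2.8) can be written out explicitly. This is a short sketch that uses only the form (2.6) and strict concavity of U; writing $U'$ for the derivative of the one-argument function on the right-hand side of (2.6) is notation introduced here.

```latex
\begin{align*}
\frac{\partial U}{\partial I} &= U'\bigl(I + R(\bar{\ell}-\ell)\bigr), \qquad
\frac{\partial U}{\partial \ell} = -R\,U'\bigl(I + R(\bar{\ell}-\ell)\bigr)
  \;\Longrightarrow\; -\frac{\partial U/\partial \ell}{\partial U/\partial I} = R, \\
U'\bigl(I(s) + R(\bar{\ell}-\ell(s))\bigr) &= \lambda \ \text{ for all } s
  \;\Longrightarrow\; I(s) + R(\bar{\ell}-\ell(s)) = (U')^{-1}(\lambda).
\end{align*}
```

The first line turns the right-hand sides of (2.2) and (2.5) into R, which is (2.7). The second line uses the fact that $U'$ is strictly decreasing, so that a constant marginal utility forces a constant total income, which is (2.8).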

The  ability  to  explain  the  divergence  between  workers'  wages  and  their 
marginal  (revenue)  product  of  labor  is  the  principal  achievement  of  the  ABG 
model. In fact the model provides a striking explanation of sticky (real)
wages or incomes, which is in notable contrast to that provided by, say,
disequilibrium theory.⁵

Let us examine the underlying assumptions of the ABG model. A key
assumption is that firms are less risk-averse than workers, and are therefore
prepared  to  act  as  insurers.   To  the  extent  that  the  shock  s  is  idiosyncratic 
to  the  firm  (we  have  essentially  assumed  this  anyway  in  regarding  goods  prices 
p  as  independent  of  s),  this  is  reasonable  since  it  is  probably  easier  for  a 
firm's  owners  to  diversify  away  idiosyncratic  profit  risk  via  the  stock  market 
than  it  is  for  workers  to  diversify  away  human  capital  risk.   However,  the 
assumption  is  less  convincing  in  a  macroeconomic  setting  where  firms'  shocks 
are  correlated. 

Even  when  the  shock  is  idiosyncratic,  it  is  not  obvious  that  a  worker 
must  look  to  his  own  firm  for  insurance.   Why  not  go  to  an  insurance  company? 
In the ABG world, where s is publicly observable, there would seem to be no
difficulty  in  making  payments  to  and  from  the  insurance  company  conditional  on 
s.   However,  if  the  model  is  complicated,  some  justification  for  the  firm  as 
insurer  can  be  given. 

First,  it  may  be  the  case  that,  while  s  is  observable  to  the  firm  and 
workers,  it  is  not  observable  to  the  insurance  company.   If  the  insurance 
company  relies  on  a  worker  to  report  s,  the  worker  will,  of  course,  have  an 
incentive  to  announce  an  s  that  maximizes  his  transfer  from  the  insurance 
company.   Now  it  is  possible  that  the  insurance  company  can  learn  s  by  getting 
independent  reports  on  it  from  the  firm  and  the  workers,  but  there  is  the 
danger  that  the  firm  and  workers  may  collude.   The  whole  process  may  involve 
considerable  costs  relative  to  the  case  of  insurance  by  the  firm. 

In  fact,  to  provide  optimal  insurance,  it  is  not  necessary  that  the 
insurance  company  observe  s,  only  that  it  observe  wages  I(s)  and  employment 
$\ell(s)$. However, even if it can observe these variables, new problems arise if
some  aspect  of  a  worker's  performance  is  unobservable  to  the  insurance 
company.   For  example,  suppose  that  to  make  employment  productive  it  is 
necessary that a worker exert effort, e. Then the optimal risk sharing
contract would insure a worker's wage subject to the worker exerting effort.
If the insurance company, which cannot observe s or e, offers insurance, the
worker  may  exert  no  effort  and  claim  that  his  low  wage  was  a  result  of  a  bad 
s.  Again  this  problem  is  reduced  if  the  firm,  which  does  observe  e,  acts  as 
insurer. 

The  reader  may  wonder  how  if  s  and  e  are  not  observable  to  outsiders 
such  as  insurance  companies,  a  contract  between  the  firm  and  workers  making  I 
and $\ell$ functions of s and e can be enforced. This is an important question, to
which  two  answers  can  be  given.   First,  it  may  be  the  case  that  the  firm  and 
workers  each  have  enough  evidence  to  establish  to  an  outsider  what  s  and  e 
really  are,  i.e.  in  the  event  of  a  dispute  between  them  "the  truth  will  come 
out"  (whereas  in  a  three  party  contract  involving  an  insurance  company, 
collusion  between  the  firm  and  the  workers  may  prevent  this).   Secondly,  if 
the contract is implicit rather than explicit, then it may be enforced by
reputational  considerations;  that  is,  the  firm  will  not  deny  that  the  worker 
exerted  effort  if  he  really  did  since  this  would  ruin  its  reputation  with 
future workers (for more on this, see Part III.4).

II.2 The Possibility of Worker Quits

The  ABG  model  is  based  on  the  idea  that  firms  insure  workers  against 
fluctuations  in  their  real  income.   This  means  that  workers  will  receive  more 
than  their  marginal  (revenue)  product  in  some  states  and  less  in  others.   A 
difficulty  that  has  been  raised  with  this  is  that  a  worker  may  quit  in  the 
latter  states,  i.e.  simply  walk  away  from  the  contract.   This  will,  of  course, 
only  be  a  problem  if  the  worker's  marginal  product  outside  the  firm  is 
comparable  to  that  inside,  i.e.  if  the  lock-in  effect  that  is  responsible  for 
the long-term relationship in the first place is small. If it is small,
however, the insurance element of the contract will be put under severe
pressure.

To see this, suppose that there is a single worker (m=1) who can work
either in the firm (L=1) or outside (L=0). To simplify, assume that the
worker's marginal (equals average) product, denoted by s, is the same inside
and  outside  the  firm  (i.e.  there  is  no  lock-in  at  all),  and  that  the  worker 
cares  only  about  total  income:   U  =  U(I)  (as  in  (2.6)).   Then  in  order  to  stop 
the  worker  quitting  at  date  1,  the  firm  must  pay  him  at  least  s  in  every 
state.   However,  in  order  to  break  even  on  the  worker,  the  firm  cannot  pay  him 
more  than  s.   The  conclusion  is  that  the  firm  will  pay  the  worker  exactly  his 
marginal  product  in  each  state,  which  is,  of  course,  the  spot  market  solution. 

In this extreme case of no lock-in, then, the insurance element is
completely destroyed. Holmstrom (1983) has argued that this conclusion is no
longer valid when employment and production take place at more than one date.
The argument is the following. In the above example, the firm could provide
complete insurance at date 1 and at the same time avoid quits by agreeing to
pay the worker $\bar{s} = \max s$ in every state. Of course, the firm makes a
loss on this, but if the worker also has a nonstochastic productivity $s_0$ at
date 0, the firm can offset this loss by paying the worker less than $s_0$ at
date 0. There is a cost of doing this, since, assuming that the worker cannot
borrow, his consumption path will be more steeply sloped over time than he
would like. (If the worker's utility function is $U(I_0) + \delta U(I_1)$,
where $(\delta^{-1} - 1)$ is the market rate of interest, the first-best
contract would have $I_0 = I_1(s) = I$, say, for all s, i.e. complete income
smoothing.) Holmstrom shows that if this cost is traded off optimally against
the insurance benefit, the outcome is incomplete insurance of the following
sort: the firm puts a floor on date 1 income by guaranteeing the worker at
least $\underline{s} < \bar{s}$; however, in states where
$s > \underline{s}$, the firm agrees to pay the worker his full marginal
product, s.
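
The trade-off behind the wage floor can be illustrated numerically. The
sketch below is not Holmstrom's own formulation: it assumes log utility, a
finite list of date 1 states, a firm that breaks even in present value at
discount factor delta, and a date 1 wage of the form max(s, floor). The names
`floor_contract` and `best_floor` are hypothetical.

```python
import math


def floor_contract(states, probs, s0, delta, floor, utility=math.log):
    """Evaluate a contract that pays max(s, floor) at date 1.

    Assumption (not from the text): the firm breaks even in present value,
    so the date-0 wage is s0 minus the discounted expected date-1 subsidy.
    """
    wages1 = [max(s, floor) for s in states]
    subsidy = sum(p * (w - s) for p, w, s in zip(probs, wages1, states))
    w0 = s0 - delta * subsidy
    if w0 <= 0:
        return None  # infeasible under log utility
    eu = utility(w0) + delta * sum(p * utility(w) for p, w in zip(probs, wages1))
    return w0, wages1, eu


def best_floor(states, probs, s0, delta, grid=2001):
    """Grid-search the floor in [min s, max s] that maximizes expected utility."""
    lo, hi = min(states), max(states)
    best = None
    for i in range(grid):
        floor = lo + (hi - lo) * i / (grid - 1)
        result = floor_contract(states, probs, s0, delta, floor)
        if result is not None and (best is None or result[2] > best[2]):
            best = (floor, result[0], result[2])
    return best  # (floor, date-0 wage, expected utility)


floor, w0, eu = best_floor(states=[0.5, 1.5], probs=[0.5, 0.5], s0=1.0, delta=1.0)
print(f"floor={floor:.3f}, date-0 wage={w0:.3f}")
```

In this example the chosen floor lies strictly between the lowest and highest
productivity (close to 5/6), and the date 0 wage is below the date 0
productivity of 1, which is the back-end loading pattern described above.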

One benefit of the Holmstrom model is that it provides an explanation of
the back-end loading of earnings (the worker gets less than his marginal
product at date 0 and at least his marginal product at date 1).⁷ However, the
model is based on a number of fairly strong assumptions. First, it is
supposed that, while the firm is bound to the contract, the worker can simply
walk away. One may ask why the contract cannot specify either that a worker
cannot quit at all or, less extremely, that a quitting worker must compensate
the firm by paying an "exit fee". In answering this question some people have
appealed to the idea that the courts will not enforce involuntary servitude of
this sort (although note that we are really talking about voluntary servitude
since the worker presumably agrees to the contract at date 0). While this may
have been the case historically, it is interesting to note that attitudes seem
to be changing; the use of exit fees (e.g. repayment of training or
transportation costs by leaving workers) seems to be on the increase, with
recent indications being that the courts are prepared to enforce them (New
York Times, October 30, 1985). In particular, there seems to be a move to
apply to labor contracts the basic principle of common law that the victim of
a breach of contract is entitled to compensatory damages, i.e. to be put in as
good a position as if the breach had not occurred. In the Holmstrom model,
compensatory damages correspond to the quitting worker paying the firm s-I(s)
in state s, where I(s) is his wage if employed by the firm. In this case,
however, the worker never desires to quit and the first-best
$I_0 = I_1(s) = I$ can be achieved.
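
A short check of why quits disappear under compensatory damages, using only
the payments named above (the labels under the braces are ours):

```latex
\begin{align*}
\text{payoff from staying in state } s &= I(s), \\
\text{payoff from quitting in state } s &=
  \underbrace{s}_{\text{outside earnings}}
  - \underbrace{\bigl(s - I(s)\bigr)}_{\text{damages}} = I(s).
\end{align*}
```

Since quitting never pays more than staying, the no-quit requirement
$I(s) \ge s$ no longer restricts the contract, which is why a constant income
becomes feasible.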

The Holmstrom model also assumes, like the ABG model, that the firm must
provide workers' insurance. We have given some justifications for this above,
but they become less plausible when the lock-in effect is small. The reason
is that, if s is the amount that the worker can earn inside or outside the
firm, the assumption that an insurance company cannot observe s is perhaps
less convincing (although there may still be problems in enforcing a contract
based on s if s isn't "verifiable"; see Part III). Even if outsiders cannot
observe s, the worker could still rely on the firm for insurance, but borrow a
fixed amount from a bank which the worker would deposit with the firm,
receiving it back only if he did not quit (i.e. the worker could post a bond).
Such an arrangement would again achieve the first-best, although it may of
course stretch to the limit the assumption that the firm will not default on
its part of the contract.⁹ (Note that this arrangement does involve a form of
back-loading.)

One case where these criticisms do not apply is where the worker can
simply "disappear". If this is so, then the firm knows that it will never be
able to collect any exit fee and no bank will be prepared to lend to the
worker. Another reason for the absence of exit fees or bond posting is that
the worker may sometimes quit for other reasons than a high alternative wage;
e.g. work in the firm may become intolerable or the worker may become sick.
These states are likely to be bad for the worker and so for reasons of risk
aversion he will be unwilling to forfeit a substantial amount in them (we are
assuming that the reason for quitting is not publicly observable and so the
exit fee cannot be made contingent on it). Considerations like these seem
likely to lead to a not insignificant complication of the model, however, and
it is unclear how robust the back-end loading result is to their introduction.

II.3   Asymmetric Information

Let  us  return  to  the  case  where  all  parties  to  the  contract  are  bound. 
As  we  have  seen,  the  ABG  model  can  explain  sticky  (real)  wages  or  incomes,  but 
not  ex-post  inefficient  employment.   Because  of  this,  various  attempts  have 
been made to enrich the ABG model. An important development has been the
introduction of asymmetric information. The first set of models along these
lines considered the case where the firm's revenue shock s is observed only by
the firm at date 1 (see Calvo-Phelps (1977) and Hall-Lilien (1977)). This
"hidden information" assumption, as Kenneth Arrow has termed it, has force
when the party with private information is risk-averse. It is this
supposition which underlies the models of Azariadis (1983) and Grossman-Hart
(1981, 1983): the firm is identified with its risk-averse manager.

A  consequence  of  managerial  risk  aversion  is  that  it  is  no  longer 
optimal  for  the  firm  to  provide  workers  with  complete  income  insurance  as  in 
the  basic  ABG  model;  rather  the  manager  will  now  want  to  obtain  some  insurance 
himself.   The  manager's  ability  to  obtain  insurance,  however,  is  limited  by 
his private information. For example, an insurance contract which pays the
manager $\alpha > 0$ dollars in state one and taxes the manager $\beta > 0$
dollars in state two cannot
be implemented if the insurer must rely on the manager to report which of the
two states has occurred (the manager will always report state one). However,
the manager's incentive to report the wrong state can be lessened by
introducing a production inefficiency: the manager will be less inclined to
report state one if, as well as receiving $\alpha$, he must choose a production
plan  which  is  inefficient,  and  moreover  is  relatively  unprofitable  if  the  true 
state  is  indeed  two.   We  shall  see  that  the  second-best  optimal  insurance 
contract  includes  production  inefficiencies  of  this  kind;  furthermore,  under 
certain  conditions  the  inefficiencies  take  the  form  of  underemployment  of 
labor  in  bad  states  of  the  world. 

It  turns  out  that  the  case  where  the  manager  and  workers  must  provide 
each  other  with  insurance  is  quite  complicated  to  analyze.   A  considerable 
simplification  is  possible,  however,  if  it  is  supposed  that  each  group  can  get 
insurance  from  a  risk  neutral  third  party  (this  is,  of  course,  a  departure 
from the idea that the firm has a comparative advantage in insuring the
workers, or vice versa). While the existence of such a third party may at
first sight seem farfetched, it can be argued that in the case of a public
company the firm's shareholders play this role, acting as a financial wedge
between the manager and the workers (moreover, risk neutrality of the
shareholders may be reasonable to the extent that they hold well-diversified
portfolios).

If  workers  can  get  insurance  from  a  third  party,  the  long-term  contract 
between  the  firm  and  workers  becomes  much  less  important,  and  in  fact  a  simple 
case  (which  we  follow)  is  where  this  contract  is  ignored  altogether,  with  the 
firm  being  assumed  to  make  all  input  purchases  in  the  date  1  spot  market. 
That  is,  we  now  focus  on  a  risk-averse  manager  with  private  information,  who 
insures  himself  with  a  risk  neutral  third  party,  and  buys  all  his  inputs  in 
the  date  1  spot  market.   (Below  we  discuss  the  implications  of  putting  the 
worker-firm  contract  back  into  the  analysis,  particularly  when  the  firm  has  a 
comparative advantage in insuring the workers.)

The  main  implications  of  asymmetric  information  can  be  understood  from 
the  special  case  where  there  are  only  two  states  of  the  world  (we  follow 
Holmstrom-Weiss  (1985)).   We  now  interpret  f  to  be  the  manager's  benefit 
function  (measured  in  dollars).   This  benefit  is  supposed  to  be  private,  i.e. 
it  does  not  show  up  in  the  firm's  accounts  and  so  payments  cannot  be 
conditioned on it. We write f = f(s, L), where we are now more general in
allowing $L \ge 0$ to be a vector of inputs or managerial decisions. It is
assumed that, while s is observed at date 1 only by the manager, L, which is
chosen after s is observed, is publicly observable. It is in fact
convenient  to  regard  f  as  the  manager's  net  benefit  in  state  s  after  all 
inputs  have  been  purchased  in  the  date  1  spot  market. 

Let the two states be $s = s_1, s_2$ with probabilities $\pi_1, \pi_2$
respectively ($\pi_1, \pi_2 > 0$, $\pi_1 + \pi_2 = 1$). The manager signs a
contract with a risk neutral third party. The contract says that in state
$s_i$, $i = 1,2$, the third party will pay the manager $I_i$ and the manager
must choose $L_i$. An optimal contract solves:

(2.9)    Max   $\pi_2 V(f(s_2, L_2) + I_2) + \pi_1 V(f(s_1, L_1) + I_1)$

         S.T.  $f(s_2, L_2) + I_2 \ge f(s_2, L_1) + I_1$,

               $f(s_1, L_1) + I_1 \ge f(s_1, L_2) + I_2$,

               $\pi_2 I_2 + \pi_1 I_1 \le 0$.

Here V is the manager's von Neumann-Morgenstern utility function, where
$V' > 0$, $V'' < 0$. The third constraint says that the third party is
prepared to participate in the contract (we give the firm all the surplus from
the transaction). The first and second constraints are the, by now well known,
truth-telling constraints (see, e.g., Myerson (1979)). Since the third party
cannot observe s directly it must rely on the manager to report s.
Constraints 1 and 2 say that the manager will report $s = s_2$ when $s_2$
occurs and $s = s_1$ when $s_1$ occurs.

Another interpretation of the contract is that, instead of asking him to
report s, the contract gives the manager the choice of the pairs $(I_1, L_1)$
and $(I_2, L_2)$. The first and second constraints then say that the manager
will choose $(I_i, L_i)$ in state $s_i$.

We shall assume that $s_2$ is the good state and $s_1$ the bad state, in the
sense that total benefits are higher in $s_2$ than in $s_1$:

(2.10)    $f(s_2, L) \ge f(s_1, L)$ for all $L \ge 0$, with strict
          inequality if $L \ne 0$.

We suppose also that:

(2.11)    f(s, L) is strictly concave in L, and the (unique) maximizer
          L(s) of f(s, L) exists and satisfies $L(s) \ne 0$ and
          $L_1^* \equiv L(s_1) \ne L(s_2) \equiv L_2^*$.

That is, the maximizer of f(s, L) is sensitive to s. (2.10) and (2.11) imply
immediately that the first-best (the solution to (2.9) without the
truth-telling constraints) cannot be achieved under asymmetric information.
The first-best has the property that L is chosen to maximize f in each state
and the manager is perfectly insured, i.e.

(2.12)    $f(s_2, L_2) + I_2 = f(s_1, L_1) + I_1$,

where $\pi_2 I_2 + \pi_1 I_1 = 0$, and $L_2 = L_2^*$, $L_1 = L_1^*$.

But, given (2.10)-(2.11), this violates the first truth-telling constraint.¹³
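
The claim that the first-best menu violates the first truth-telling
constraint can be checked on a small example. The sketch below assumes a
benefit function f(s, L) = 2s*sqrt(L) - L, which is not from the text; it
satisfies (2.10)-(2.11) with maximizer L(s) = s^2. The states s1 = 1, s2 = 2,
the equal probabilities and the helper names are likewise illustrative
assumptions.

```python
import math


def benefit(s, L):
    """Hypothetical benefit function f(s, L) = 2*s*sqrt(L) - L (an assumption)."""
    return 2.0 * s * math.sqrt(L) - L


def truth_telling(f, s1, s2, menu, tol=1e-12):
    """Return (c1, c2): whether the first and second constraints of (2.9) hold.

    menu = ((I1, L1), (I2, L2)); s1 is the bad state, s2 the good state.
    """
    (I1, L1), (I2, L2) = menu
    c1 = f(s2, L2) + I2 >= f(s2, L1) + I1 - tol  # good-state manager keeps (I2, L2)
    c2 = f(s1, L1) + I1 >= f(s1, L2) + I2 - tol  # bad-state manager keeps (I1, L1)
    return c1, c2


def first_best_menu(f, s1, s2, pi1, pi2, L1_star, L2_star):
    """Full insurance with efficient inputs and zero expected transfer, as in (2.12)."""
    gap = f(s2, L2_star) - f(s1, L1_star)
    I1 = pi2 * gap    # transfer received in the bad state
    I2 = -pi1 * gap   # transfer paid in the good state
    return (I1, L1_star), (I2, L2_star)


menu = first_best_menu(benefit, 1.0, 2.0, 0.5, 0.5, L1_star=1.0, L2_star=4.0)
print(truth_telling(benefit, 1.0, 2.0, menu))  # (False, True)
```

With full insurance the manager's income is 2.5 in both states, but a
good-state manager who picks the bad-state pair would obtain 4.5, so only the
first constraint fails.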

This observation suggests that only the first truth-telling constraint will be
binding in the solution to the second-best. This turns out to be true, as is
proved in the Appendix (it is interesting to note that in the two state case
we can establish this even in the absence of a Spencian single crossing
property on marginal benefit). It follows that the first-order conditions for
(2.9) are:

(2.13)    $\pi_2 V'_2 - \mu \pi_2 + \lambda = 0$,

          $(\pi_2 V'_2 + \lambda)\, \partial f/\partial L_k (s_2, L_2) = 0$ for all k,

          $\pi_1 V'_1 - \mu \pi_1 - \lambda = 0$,

          $\pi_1 V'_1\, \partial f/\partial L_k (s_1, L_1) - \lambda\, \partial f/\partial L_k (s_2, L_1) = 0$ for all k,

where $V_i \equiv V(f(s_i, L_i) + I_i)$, and similarly $V'_i$;
$f_i \equiv f(s_i, L_i)$; $\lambda \ge 0$ is the Lagrange multiplier for the
first constraint, $\mu \ge 0$ for the third, and the second constraint has a
zero multiplier. In fact, $\lambda > 0$ since $\lambda = 0$ gives us the
first-best, which we know violates constraint 1.
From the second equation in (2.13), we see that

(2.14)    $\partial f/\partial L_k (s_2, L_2) = 0$ for all k, i.e. $L_2 = L_2^*$,

while the third and fourth equations imply that

(2.15)    $\mu \pi_1\, \partial f/\partial L_k (s_1, L_1) + \lambda \left( \partial f/\partial L_k (s_1, L_1) - \partial f/\partial L_k (s_2, L_1) \right) = 0$ for all k.

It follows that $\partial f/\partial L_k (s_1, L_1)$ can be zero only if
$\partial f/\partial L_k (s_2, L_1)$ is also zero. Therefore, we cannot have
$\partial f/\partial L_k (s_1, L_1) = 0$ for all k since this would imply
that $L_1 = L_1^*$ maximizes $f(s_2, L)$, which contradicts (2.11). Hence we
have established

(2.16)    $L_1 \ne L_1^*$.
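
For reference, the step from (2.9) to (2.15) can be written out. This is a
sketch that relies on the conclusion already stated in the text (only the
first truth-telling constraint binds); the symbol $\mathcal{L}$ for the
Lagrangian is introduced here for illustration, with $\lambda$ and $\mu$ the
multipliers on the first and third constraints:

```latex
\begin{align*}
\mathcal{L} &= \pi_2 V\bigl(f(s_2,L_2)+I_2\bigr) + \pi_1 V\bigl(f(s_1,L_1)+I_1\bigr) \\
&\quad + \lambda\bigl[f(s_2,L_2)+I_2 - f(s_2,L_1) - I_1\bigr]
       - \mu\bigl[\pi_2 I_2 + \pi_1 I_1\bigr], \\[4pt]
\frac{\partial \mathcal{L}}{\partial I_1} &= \pi_1 V'_1 - \lambda - \mu\pi_1 = 0
\;\Longrightarrow\; \pi_1 V'_1 = \mu\pi_1 + \lambda, \\
\frac{\partial \mathcal{L}}{\partial L_{1k}} &=
  \pi_1 V'_1 \frac{\partial f}{\partial L_k}(s_1,L_1)
  - \lambda \frac{\partial f}{\partial L_k}(s_2,L_1) = 0 .
\end{align*}
```

Substituting the first condition into the second gives
$\mu\pi_1\, \partial f/\partial L_k(s_1,L_1) + \lambda[\partial f/\partial L_k(s_1,L_1) - \partial f/\partial L_k(s_2,L_1)] = 0$,
which is (2.15).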


(2.14) and (2.16) comprise the main result of this (two state)
asymmetric information model: the optimal second-best contract has efficient
production in the good state, but inefficient production in the bad state.
The intuition behind this is that if production in the bad state were
efficient, an improvement could be made by perturbing $L_1$ slightly so as to
reduce $f(s_2, L_1)$: this would have only a second order effect on
$f(s_1, L_1)$ by the envelope theorem but would relax the truth-telling
constraint with a positive multiplier (in contrast, perturbations in $L_2$ do
not relax this constraint). In fact, in general, we will have a distortion in
each of the firm's input decisions in the bad state. For (2.15) tells us that

(2.17)    $\partial f/\partial L_k (s_1, L_1) = 0 \Rightarrow \partial f/\partial L_k (s_2, L_1) = 0$,

i.e. $L_k$ is undistorted in the second best only if its optimal value doesn't
depend on s. To put it another way, in general, the manager's contract with
the third party will constrain every observable dimension of action that the
manager takes in the bad state.

To identify the direction of the distortion in $L_1$, we must put further
restrictions on f. Assume that

(2.18)    $\partial f/\partial L_k (s_2, L) > \partial f/\partial L_k (s_1, L)$ for all L,

i.e. the marginal product of each input is higher in the good state for all L.
Then

(2.19)    $\partial f/\partial L_k (s_1, L_1) > 0$ for all k,

since the bracketed term in (2.15) is negative, and so the first term must be
positive. (2.19) tells us that each input $L_k$ is underemployed, given other
input choices. It does not necessarily follow that $L_1 < L_1^*$, although
this will be so if either (i) L is one dimensional; (ii) L is two dimensional
and $f_{12} > 0$; or (iii) f is Cobb-Douglas. In these cases, we may conclude
(using also (2.14)) that input use varies more across states in the
second-best than in the first-best.
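
The underemployment result can be illustrated with a one-input numerical
example. The sketch below is only an illustration under stated assumptions:
the benefit function f(s, L) = 2s*sqrt(L) - L, CARA utility V(x) = -exp(-x),
states s1 = 1, s2 = 2 and equal probabilities are all hypothetical choices.
It also takes from the text that the good-state input is efficient, that the
first truth-telling constraint binds, and that the expected transfer is zero,
and then searches over the bad-state input only.

```python
import math


def f(s, L):
    """Assumed benefit function; maximized over L at L = s**2."""
    return 2.0 * s * math.sqrt(L) - L


def V(x):
    """Assumed CARA utility with unit absolute risk aversion."""
    return -math.exp(-x)


def second_best(s1=1.0, s2=2.0, pi1=0.5, pi2=0.5, grid=20001):
    """Grid-search the bad-state input L1 on (0, L2*]."""
    L1_star, L2_star = s1 ** 2, s2 ** 2
    best = None
    for i in range(1, grid + 1):
        L1 = L2_star * i / grid
        # A binding first truth-telling constraint fixes I1 - I2; a zero
        # expected transfer then pins down both payments.
        gap = f(s2, L2_star) - f(s2, L1)
        I1, I2 = pi2 * gap, -pi1 * gap
        eu = pi2 * V(f(s2, L2_star) + I2) + pi1 * V(f(s1, L1) + I1)
        if best is None or eu > best[0]:
            best = (eu, L1, I1, I2)
    eu, L1, I1, I2 = best
    return {"L1": L1, "L1_star": L1_star, "L2": L2_star,
            "I1": I1, "I2": I2, "EU": eu}


result = second_best()
print(result["L1"], result["L1_star"])
```

In this example the bad-state input comes out at roughly 0.27, well below its
efficient level of 1, while the good-state input stays at its efficient level
of 4, so input use varies more across states than in the first-best.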

Unfortunately,  these  results  do  not  generalize  easily  to  the  case  of 
more  than  two  states  (although  there  will  still  generally  be  distortions). 
The  reason  is  that  it  becomes  much  harder  to  know  which  of  the  many  truth- 
telling  constraints  will  be  binding.   One  case  where  progress  can  be  made  is 
when  there  is  only  one  input  and,  as  in  (2.18),  the  marginal  product  of  this 
input  can  be  ranked  across  all  states.   Then  only  the  downward  truth-telling 
constraints  are  binding  and  the  underemployment  result  holds.   For  a 
discussion  of  this  case,  see  Hart  (1983). 

As we have noted, the above model emphasizes the idea of a risk averse
manager trying to get insurance against fluctuations in his net income.¹⁴ In
order to maintain the assumption of informational asymmetry it must be
supposed that this income is private (it doesn't show up in the firm's
accounts). A generalization of the above model would have part of net income
observed and part not; e.g., $f = f_1 + f_2$, where $f_1$ is the firm's
profit and $-f_2$ represents the manager's effort cost in realizing this
profit (or $f_2$ represents managerial "perks"). The only difference now is
that the manager's insurance payment I can be conditioned on $f_1$ so that
$f_1$ becomes like one of the observable inputs L. This case is analyzed in
Holmstrom-Weiss (1985).

The above model completely deemphasizes the role of the long-term
contract between the firm and its workers. This can be reintroduced without
significant change if it is supposed that the workers, like the manager, can
receive wage insurance from a third party; on this, see Hart (1983). If,
however, for reasons discussed in II.1, the manager has a comparative
advantage in providing insurance, matters become more complicated. The reason
is that the manager is a "flawed" insurer, even if he is risk neutral, since
he has private information. As Chari (1983) and Green-Kahn (1983) have shown,
this leads to a further distortion in production. For example, if
$U(I, \ell) = a(I) - \ell$, where $\ell$ is employment and $a'' < 0$, the
solution to (2.2)-(2.3) has I(s) = constant, and $\ell(s)$ increasing in s
when the manager is risk neutral. This, however, gives the manager an
incentive always to report the highest employment state. That is, in the two
state example of this section, the manager now has an incentive to report
$s_2$ when the true state is $s_1$. To overcome this, the second-best will
have I(s) increasing with $\ell(s)$. In addition, in the two state example,
the optimal second-best contract will have the property that the second
truth-telling constraint is binding, and (2.2) holds with equality in the bad
state and the left-hand side of (2.2) is less than the right-hand side in the
good state. This has been called "overemployment" although the inequality in
(2.2) does not necessarily imply that $\ell(s_2)$ is higher in the
second-best. This overemployment result in fact holds whenever the manager is
risk neutral, as long as $U(I, \ell)$ has the property that leisure is a
normal good (see Chari (1983), Green-Kahn (1983)).

If  the  manager  is  risk  averse,  this  overemployment  effect  comes  into 
conflict  with  the  underemployment  effect  discussed  above.   To  put  it  another 
way,  the  manager's  desire  to  get  insurance  for  himself  comes  into  conflict 
with  his  role  as  insurer  for  the  workers.   Which  effect  "wins"  depends  in  some 
sense  on  how  risk  averse  the  manager  is  in  comparison  with  how  normal  leisure 
is  (see  Cooper  (1983)).   One  case  where  there  is  no  conflict  is  when,  as  in 
(2.6),  the  cost  of  supplying  labor  comes  entirely  from  missed  outside 
opportunities; or more generally when $U(I, \ell) = U(I - g(\ell))$. Under these
conditions,  the  overemployment  effect  disappears  and  we  unambiguously  have 
underemployment  (see  Azariadis  (1983),  Grossman  and  Hart  (1981,  1983)).   As  we 
have  noted,  another  case  where  underemployment  is  the  outcome  is  when  the 
workers can obtain income insurance elsewhere.

The fact that some asymmetric information models predict underemployment
while  others  predict  overemployment  has  caused  some  to  conclude  that  this  is 
not  a  fruitful  approach  for  analyzing  employment  distortions.   This  seems 
unfortunate  for  several  reasons.   First  the  models  all  have  the  property  that 
there  is  ex-post  inefficient  employment.   This  is  of  considerable  interest 
given  that  most  neoclassical  models  of  the  labor  market  —  those,  say,  that 
treat  it  as  a  spot  market  or  analyze  wage-employment  decisions  as  a  symmetric 
information  bargaining  process  —  predict  ex-post  efficiency.   Secondly,  the 
underemployment  and  overemployment  models  may  not  be  in  quite  as  much  conflict 
as  is  sometimes  thought.   Since  one  refers  to  underemployment  in  bad  states 
and  the  other  to  overemployment  in  good  states,  both  in  fact  suggest  increased 
employment  variability  compared  to  the  spot  market  (the  word  "suggests"  is 
important  here  since,  as  we  have  noted,  "overemployment"  refers  to  the 
relative  size  of  marginal  rates  of  substitution  and  transformation  rather  than 
to  differences  in  labor).   From  a  macroeconomic  point  of  view,  this  may  be  the 
most  important  conclusion.   Thirdly,  the  question  of  whether  the 
overemployment  effect  is  likely  to  dominate  the  underemployment  effect  in  a 
particular  context  is  one  that  empirical  work  can  shed  light  on.   Most 
empirical  analyses  of  the  labor  market  find  that  participation  decisions  of 
prime  age  males  are  highly  income  inelastic  (see  Killingsworth  (1983)).   This 
suggests  that  the  normality  of  leisure  effect  is  likely  to  be  very  small  with 
respect  to  significant  employment  changes  that  are  more  than  temporary,  e.g. 
severances.   To  put  it  another  way,  outside  earning  opportunities  are  likely 
to  swamp  leisure  as  an  opportunity  cost  of  labor  in  such  cases,  which  provides 
some support for the utility function $U(I - g(\ell))$ and for the underemployment
effect. On the other hand, the overemployment effect may be more relevant in
the case of temporary layoffs or short-run variations in hours.

Finally, mention should be made of a body of literature that considers
other asymmetries of information. Some papers have analyzed the case where
workers  have  private  information  about  their  opportunity  costs  (see,  e.g., 
Kahn  (1985)  and  Moore  (1985))  while  others  have  studied  situations  where  firms 
and  workers  each  possess  some  private  information.   This  last  "two-sided"  case 
is  very  complex  and  only  limited  progress  has  so  far  been  made  in  its  analysis 
(see, e.g., d'Aspremont-Gerard-Varet (1979), Riordan (1984), and,
particularly,  Moore  (1984)). 

II.4   Extensions of the Labor Contract Model

A.   Macroeconomic  Applications 

The original labor contract model was developed with an eye to
macroeconomic  applications.   The  discovery  that  employment  decisions  are  ex- 
post  efficient  (and,  under  (2.6),  the  same  as  in  the  Walrasian  model)  perhaps 
dampened  enthusiasm,  but  the  advent  of  the  asymmetric  information  models  has 
stimulated  some  new  work  in  this  direction. 

A simple way to incorporate the model of II.3 into a macroeconomic
setting  is  to  suppose  that  the  economy  consists  of  many  identical  managerial 
firms,  with  perfectly  correlated  demand  or  supply  shocks  s.   Given  (2.14), 
(2.19),  this  would  seem  to  give  us  an  explanation  of  why  an  aggregate  down 
shock  would  lead  to  a  greater  fall  in  employment  in  each  firm  than  would  be 
expected  in  a  spot  market. 

Unfortunately,  this  is  too  simple.   If  all  firms  reduce  employment,  one 
would  surely  expect  this  to  be  observable  to  workers  (and  third  parties); 
moreover  since  presumably  no  firm  has  an  influence  on  aggregate  employment, 
and  aggregate  employment  is  perfectly  correlated  with  s,  the  asymmetry  of 
information will disappear if payments are conditioned on this variable.

Two ways of overcoming this problem have been attempted. One is to
suppose that the aggregate shock causes a change in the variance of the
distribution of s as well perhaps as its mean (see Grossman, Hart and Maskin
(1983)). Suppose that there are two states of the economy, one, $\alpha_1$, in
which the variance of s is very small and the other, $\alpha_2$, in which it
is large. Consider the special case where the Walrasian aggregate employment
levels are the same in $\alpha_1$ and $\alpha_2$. Then under asymmetric
information a shock that moves the economy from $\alpha_1$ to $\alpha_2$, even
if it is publicly observed (say, through changes in aggregate employment),
will reduce total employment. This is because the asymmetry of information
will be (almost) irrelevant in the low variance state $\alpha = \alpha_1$
(where a firm's profitability can (essentially) be deduced from macro
variables), but will have force in the high variance state
$\alpha = \alpha_2$. Hence aggregate employment will be close to the Walrasian
level in $\alpha_1$, but will be below the Walrasian level in $\alpha_2$; the
latter because low s firms have lower employment levels under asymmetric
information (by (2.19)), while high s firms do not have higher employment
levels (by (2.14)). Together these arguments yield the conclusion that total
employment will be lower in $\alpha_2$ than in $\alpha_1$ under asymmetric
information. In fact the same logic generalizes to show that if Walrasian
aggregate employment falls when the economy is hit by a variance-increasing
shock, this fall will be amplified under asymmetric information.

Farmer (1984) exploits a similar idea. Suppose that a publicly
observable macroeconomic shock increases the cost of firms' inputs, e.g. by
raising the real rate of interest. Then although the distribution of s may
not change, firms' net profits fall. If managers have decreasing absolute
risk aversion, this is like an increase in managerial risk aversion. This
will increase the distortion found in low s firms (which is a function of risk
aversion), without there being offsetting effects in high s firms (by (2.14)).
Hence an aggregate increase in unemployment will again be amplified under
asymmetric information.

A second approach is to suppose that s consists of a component common to
all firms and an idiosyncratic component (see Holmstrom-Weiss (1985)). The
common component will presumably again show up in the aggregate employment
figures, and so wages can be conditioned on it, but suppose these figures
are published with a lag, after managers learn their s and employment
decisions must be made. A low s manager will then be unsure whether his is
one of many adversely affected firms (i.e., there has been an aggregate down
shock) or whether he is in a minority (i.e. he has had a bad idiosyncratic
shock). In the first case, he will be able to reduce the wage rate (with a
lag), whereas in the second case he won't (it is not incentive-compatible to
allow a firm with a bad idiosyncratic shock to reduce wages). A risk-averse
manager will put relatively high weight on the second possibility and so will
cut back on employment as a second-best way of reducing the wage bill. As
above, this is not compensated for by an increase in employment in high s
firms (by (2.14)). The result can be shown to be greater aggregate employment
variability between economy-wide up shocks and down shocks than would occur in
a spot market.

The conclusion that aggregate employment levels can be inefficient
raises the question of whether there is a role for government intervention.
In a version of the Grossman-Hart-Maskin (1983) model, where s reflects a
relative demand shock, it can be shown that a policy which stabilizes demand
across different firms can be welfare improving. This is because demand
shifts have an externality effect via their impact on the extent of the
asymmetry of information between firms and workers and/or third parties.
Since externalities like this seem to be a fairly pervasive feature of
asymmetric information/moral hazard models (see Part I), it seems likely that
there will be a role for government intervention in other models too (of
course, the usual qualification that the government may require very good
information to improve things should be borne in mind). Work on this topic is
still in its infancy, however, and general results on the nature of
macroeconomic externalities and the way to correct them are not yet available.

B.   Involuntary  Unemployment 

We  have  focussed  on  whether  contract  theory  can  explain  ex-post 
inefficient  allocations.   A  related  question  which  has  received  attention  is 
whether  the  theory  can  explain  involuntary  layoffs.   The  results  here  have 
been  rather  disappointing. 

To understand the issues, let us return to the ABG model, but drop the
assumption of work-sharing. Instead we suppose that $\ell = 0$ or $1$ for each
worker at date 1. A contract will now specify a number of workers
$n(s) \le m$ who should work in state s, a payment $I_e(s)$ to each of these
and a payment $I_u(s)$ to each of the laid off workers. The total wage bill
W(s) in state s then equals

(2.20)    $W(s) = n(s) I_e(s) + (m - n(s)) I_u(s)$.

Since the firm cares only about the size of this wage bill and not how it's
divided, an optimal contract must in each state solve:

(2.21)    Maximize   $\frac{n(s)}{m} U(I_e(s), 1) + \left(1 - \frac{n(s)}{m}\right) U(I_u(s), 0)$

          S.T.       $W(s) = W$.

Here the maximand is the expected utility of each worker, given that layoffs
are chosen randomly. The first order conditions for (2.21) are

(2.22)    $\frac{\partial U}{\partial I}(I_e(s), 1) = \frac{\partial U}{\partial I}(I_u(s), 0)$,


that  is  retained  and  laid  off  workers  should  have  the  same  marginal  utility  of 
income.   It  is  not  difficult  to  show  that  this  implies  that  laid-off  workers 
are  better  off  than  retained  workers  if  leisure  is  a  normal  good,  worse  off  if 
it's inferior and equally well off if $U = U(I - g(\ell))$, i.e., if the demand for
leisure  is  income  inelastic. 
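
The step from (2.21) to (2.22) can be spelled out as follows. This is only a sketch under assumptions not stated in the text: an interior solution with 0 < n(s) < m, n(s) taken as given, U differentiable in income, and (for the last line) U strictly concave. The multiplier $\lambda$ is notation introduced here, and $I_e$, $I_u$ denote the payments to employed and laid-off workers in the state considered.

```latex
\begin{align*}
\mathcal{L} &= \frac{n}{m}\,U(I_e,1) + \Bigl(1-\frac{n}{m}\Bigr)U(I_u,0)
   - \lambda\bigl[\,n I_e + (m-n) I_u - W\,\bigr] \\
\frac{\partial\mathcal{L}}{\partial I_e} = 0
  &\;\Longrightarrow\; \frac{\partial U}{\partial I}(I_e,1) = \lambda m \\
\frac{\partial\mathcal{L}}{\partial I_u} = 0
  &\;\Longrightarrow\; \frac{\partial U}{\partial I}(I_u,0) = \lambda m \\
\text{If } U = U(I-g(\ell)):\quad
  U'(I_e - g(1)) &= U'(I_u - g(0))
  \;\Longrightarrow\; I_e - g(1) = I_u - g(0).
\end{align*}
```

In the last case the argument of U is the same for both groups, so retained and laid-off workers obtain the same utility.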

Since it's hard to argue empirically that leisure is inferior, this model
gives us the perverse result that there will be ex-post involuntary
retentions.$^{19}$ Various attempts have been made to get away from this. One
approach is to drop the assumption that the utility function $U(I,\ell)$ is
publicly known at date 1. For example, suppose
$U(I,\ell) = U(I + R(\bar{\ell} - \ell))$, where R, the outside reservation wage at
date 1, is a random variable. A simple case is where neither the firm nor the
workers know R when the layoff decision is made (but both know its
distribution). Under these conditions, Geanakoplos and Ito (1982) have shown
that the optimal contract will involve involuntary layoffs only if U exhibits
increasing absolute risk aversion (which is usually regarded as implausible).

A second case is where R is known to the workers but not to the firm. If
workers' R's are correlated and there are many of them, it is likely that the
firm will be able to elicit the common component, and so the natural case to
study is where the R's are independently drawn from a known distribution. In
this case, however, Moore (1985) has shown that the utility function U gives
rise to involuntary retentions if A < S. (In contrast, involuntary layoffs
occur if A > A; e.g. with the utility function $U(I - R\ell)$.)$^{20}$ One
disturbing feature of any contract where retention is involuntary is that it
gives workers an incentive to be fired. Several papers have built on this,
developing models in which involuntary layoffs are part of an incentive scheme
to encourage the firm's work force to work hard (see Malcomson (1984), Hahn
(1984) and Eden (1985)). These models may apply to situations where employers
have discretion about whom to lay off, but in practice this appears usually
not to be the case; in union contracts, for example, layoffs are almost
always by seniority.

It  should  be  emphasized  that  none  of  these  theories  explains  involuntary 
unemployment  at  the  contract  date.   The  reason  is  that,  if  there  is  a 
competitive  labor  market  at  date  0,  an  optimal  contract  will  have  the  property 
that each employed worker's expected utility equals $\bar{U}$, the market clearing
level. In particular, it cannot be an equilibrium for employed workers to
receive more than $\bar{U}$ and employment to be rationed, since individual firms
could  then  increase  profit  by  reducing  wages,  I(s),  in  each  state  (without 
distorting  incentives).   This  conclusion  is  subject  to  some  qualifications. 
First,  it  may  be  impossible  to  reduce  wages  in  some  states  because  workers  are 
at  the  boundary  of  their  consumption  set.   Secondly,  in  models  involving 
worker  effort,  if  a  worker's  utility  function  U(e,I)  is  appropriately 
nonseparable  in  effort  and  income,  a  reduction  in  I  may  have  a  sufficiently 
adverse  effect  on  a  worker's  desire  to  work  to  be  unprofitable  for  the  firm 
(see,  e.g.,  Malcomson  (1981);  a  similar  incentive  effect  underlies  much  of  the 
efficiency  wage  literature;  see,  e.g.,  Shapiro-Stiglitz  (1984)).   In  both  of 
these cases, employed workers may receive more than $\bar{U}$. Thirdly, involuntary
unemployment at the contract date is possible in models where there is adverse
selection at date 0, an important case which falls outside the scope of this
survey (see Weiss (1980) and Stiglitz-Weiss (1981)).

C.   "Long-term" (Repeated) Contracts


Labor  contracts,  whether  implicit  or  explicit,  have  been  regarded  as 
most  important  in  long-term  relationships.   To  formalize  these  relationships 
as a "one-shot" situation as in II.1 - II.3 does not appear very satisfactory.
Nevertheless,  there  are  dynamic  versions  for  which  the  preceding  analysis 
applies  essentially  intact.   A  particular  example  that  fits  precisely  the 
structure in (2.9) is in Fudenberg et al. (1986).

Consider an infinitely (and independently) repeated version of the one-period
model studied above. The manager's utility over a consumption stream
$\{c_t\}$ is given by $-\sum_t \delta^t \exp(-r c_t)$, where $\delta$ is the discount
factor and r is the manager's coefficient of absolute risk aversion. The
manager can borrow and save freely at the interest rate $(1-\delta)/\delta$. This is
not observed by the principal.

As discussed in I.7, an optimal long-term contract can in this situation
be  duplicated  by  a  sequence  of  short-term  contracts  —  with  exponential 
utility  and  independent  shocks  a  sequence  of  identical  short-term  contracts. 
Note,  however,  that  an  optimal  one-period  contract  in  the  dynamic  model  is  not 
the  same  as  in  the  static  model,  because  the  manager  can  smooth  consumption. 
Instead,  the  one-period  solution  in  the  dynamic  case  is  the  same  as  if  the 
manager  worked  just  once,  but  consumed  forever.   (Because  there  are  no  income 
effects  and  shocks  are  independent,  contracts  across  periods  do  not  affect 
each  other.)   This  program  can  be  reduced  to  the  form  (2.9)  as  follows. 

Assume the manager consumes after he is paid in the single period he
works (this is the reverse of Fudenberg et al., but of no consequence for the
decomposition result). Let $w_i$ be the manager's net wealth if $s_i$ occurs in
that period (i.e. $w_i = f(s_i, L_i) + I_i$, i = 1,2, in our earlier notation).
With no further income, the manager will consume the interest on his wealth in
all future periods, that is, he will consume $(1-\delta) w_i$ forever. This implies a
life-time utility $V(w_i) = -\exp\{-r(1-\delta) w_i\}/(1-\delta)$. Consequently, using
this V as the manager's utility function in (2.9), we obtain the optimal
short-term contract for the dynamic case.

Notice that the only difference between the static problem and the
dynamic (short-term) problem is that the manager's risk aversion coefficient
is smaller in the latter. In the dynamic case the coefficient is $r(1-\delta)$,
while in the static case it is r. The reduction in risk aversion comes from
self-insurance in the dynamic model. In the limit, as $\delta$ goes to one and there
is no discounting of the future, the manager acts effectively in a risk
neutral fashion. One optimal (and first-best) solution in that situation is
to rent out the technology to the manager and let him carry all the risk.
(Recall our earlier comment on Yaari's work in I.7.)
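
The life-time utility V and the reduced risk-aversion coefficient can be checked directly. The sketch below assumes that periods are indexed t = 0, 1, 2, ..., that consumption takes place at the start of each period, and that the manager follows the stationary plan described above; $\delta$ denotes the discount factor and w the manager's wealth.

```latex
\begin{align*}
\text{Wealth is preserved:}\quad
   \bigl[w - (1-\delta)w\bigr]\Bigl(1 + \frac{1-\delta}{\delta}\Bigr)
   &= \delta w \cdot \frac{1}{\delta} = w \\
V(w) = -\sum_{t=0}^{\infty} \delta^{t}\exp\{-r(1-\delta)w\}
     &= -\frac{\exp\{-r(1-\delta)w\}}{1-\delta} \\
V'(w) = r\exp\{-r(1-\delta)w\}, \qquad
V''(w) &= -r^{2}(1-\delta)\exp\{-r(1-\delta)w\} \\
-\frac{V''(w)}{V'(w)} &= r(1-\delta)
\end{align*}
```

The last line is the coefficient of absolute risk aversion of V; it is smaller than r whenever the discount factor is positive, and tends to zero as the discount factor tends to one.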

Since  the  introduction  of  dynamics  in  this  example  only  changes  the 
manager's  risk  aversion  coefficient,  the  earlier  static  analysis  applies 
directly.   We  conclude  that  while  there  will  be  a  smaller  allocational 
distortion  in  the  multi-period  situation  than  in  the  one-period  situation,  in 
both  cases  the  distortion  will  be  qualitatively  the  same. 

It is also worth noting that not all long-run relationships are subject
to independent shocks. With serial correlation of the $s_t$'s, however, the
gains from self-insurance may be substantially reduced. For instance, in the
extreme case of a single shock that persists forever, there are no
self-insurance gains at all, and the optimal long-term contract will be the
repeated static contract from (2.9). More generally, with positive
correlation, repetition will have a smaller effect in reducing the level of
second-best inefficiency than with independent shocks; in the extreme case of
perfect correlation of the $s_t$'s there will be no reduction at all (but see D
below).

D.   Enforcement  of  the  Contract 


The  asymmetric  information  contract  models  are  sometimes  criticized  on 
the  grounds  that  "while  the  parties  may  agree  in  advance  to  have  unemployment 
in  bad  states  of  the  world,  they  will  surely  change  their  mind  once  such  a 
state  is  realized".   To  understand  this,  consider  a  firm  that  signs  a  contract 
with a single worker, and suppose that $\ell = 0$ or $1$, there are two states
$s = s_1$ or $s = s_2$ ($s_2 > s_1$), and the ex-post opportunity cost of labor is
zero. An optimal second-best contract might have the property that
$\ell_2 = 1$ when $s = s_2$ and $\ell_1 = 0$ when $s = s_1$. But suppose now that
$s = s_1$ is realized and the firm lays off the worker. Then the argument goes
that the firm and worker will recontract at this stage since they will
recognize that there are some unexploited gains from trade (assuming that
$s_1 > 0$).

Such  recontracting  can  only  make  the  parties  worse  off  in  ex-ante  terms 
(assuming  it  is  anticipated)  —  otherwise  the  original  contract  would  not  have 
been  an  optimal  one.   The  question  therefore  is  whether  the  parties  can 
precommit themselves not to renegotiate. In the static model of II.3, the
answer  seems  to  be  yes.   Presumably  there  is  a  "last  moment"  at  which 
employment  decisions  must  be  made.   Let  the  original  contract  state  that  the 
firm  can  change  its  mind  about  whether  to  employ  the  worker  up  to  this  last 
moment.   Then  any  threat  to  lay  off  the  worker  before  the  last  moment  is  not 
credible  since  the  worker  knows  that  the  firm  can  costlessly  change  its  mind, 
while  a  threat  at  the  last  moment  is,  of  course,  useless  to  the  firm  since  by 
that time it is too late to renegotiate.

The recontracting criticism does not therefore seem to be valid when
there is only one employment date. However, it does have force in a dynamic
context. Change the above example so that the worker can work or not work on
each of T "days" (but suppose, in contrast to II.4B, that the shock s is the
same for all days). The optimal second-best contract might call for the
worker to be laid off for $1 < T' < T$ days in the bad state, $s = s_1$. However,
it is hard to see what is to stop the parties from renegotiating such a
contract after one day of unemployment, given that there are clearly
unexploited gains from trade at this point and that the only irreversible
decision which has been made concerns the first day's layoff.$^{21}$

In future work it would seem interesting to investigate the constraints
that such renegotiation puts on dynamic contracts.$^{22}$ We shall return to the
issue of ex-post renegotiation in Part III.

II.5   Summary and Conclusions


There  seem  to  be  two  major  conclusions  from  the  labor  contract 
literature.   First,  in  an  optimal  contract,  there  will  be  systematic 
discrepancies  between  wages  and  the  marginal  product  of  labor.   Secondly, 
under  asymmetric  information,  there  will  be  ex-post  inefficiencies. 

Both  these  conclusions  have  important  implications  for  the  way  we  think 
about  labor  markets.   In  almost  all  empirical  work  on  the  labor  market,  for 
instance,  it  is  taken  for  granted  that  wages  measure  the  opportunity  cost  of 
labor,  and  that  firms  will  be  on  their  demand  curves  or  workers  on  their 
supply  curves  or  both.   In  a  contracting  framework,  as  Rosen  (1985)  has 
stressed,  none  of  these  suppositions  is  valid.   To  take  another  example,  it  is 
often  assumed  that  the  following  is  a  good  model  of  union  behavior:   the  union 
chooses the wage rate to maximize the representative worker's utility subject
to the constraint that the firm will be on its labor demand curve. According
to  the  contracting  framework,  however,  such  behavior  is  irrational  since  both 
the  firm  and  union  can  make  themselves  better  off  by  agreeing  on  a  wage- 
employment  pair  that  lies  on  the  efficiency  frontier. 

In  view  of  the  strong  implications  of  the  labor  contract  approach,  it  is 
important  to  know  how  well  the  theory  "matches  up"  with  the  facts.   Serious 
econometric  work  on  this  topic  is  only  just  beginning,  but  some  interesting 
papers by Ashenfelter-Brown (1982) and Card (1985) are already available.
These  papers  test  the  prediction  of  the  ABG  model  that  ex-post  employment 
levels  can  be  explained  by  opportunity  costs  rather  than  actual  wages  (as  in 
(2.7)).   The  results  obtained  so  far  suggest  little  support  for  this 
hypothesis,  but  it  is  possible  that  some  of  the  explanatory  power  of  actual 
wages found by Ashenfelter-Brown and Card can be traced to asymmetries of
information (as in the model of II.3) rather than being a rejection of the
optimal  contracting  approach  per  se.   Unfortunately,  testing  the  asymmetric 
information  contract  model  directly  seems  a  very  difficult  task  and  we  are  not 
aware  of  any  attempts  so  far  in  this  direction. 

A  much  less  formal  empirical  approach,  which  has  been  adopted  by  Oswald 
(1984),  is  to  examine  actual  labor  contracts  to  see  whether  they  contain  the 
features  that  one  might  expect  from  the  theory.   The  results  here  have  again 
been  less  than  favorable  to  the  contracting  approach.   First,  most  non-union 
contracts  are  surprisingly  rudimentary,  sometimes  consisting  of  as  little  as  a 
verbal  statement  that  an  employee  has  a  job  at  a  particular  (current)  wage. 
Secondly,  union  contracts,  although  frequently  lengthy  and  complex,  do  not 
contain  a  number  of  the  provisions  that  the  theory  suggests  they  should.   For 
example,  it  is  rare  to  find  joint  agreements  on  wages  and  employment; 
typically  wage  rates  are  specified  over  the  course  of  the  contract,  but 
employment decisions are left to the firm (while this is not inconsistent with
the model of II.3, in more general asymmetric information models, where, e.g.,
workers and firms both have private information, an optimal contract will
involve joint determination of employment by firms and workers). Other
anomalies are the lack of indexation of wages to retail prices or to variables
such as firm employment or firm sales, and the limited provisions for layoff
pay.

Of  course,  one  possible  escape  for  the  contract  theorist  is  to  argue 
that  whatever  does  not  appear  in  the  explicit  contract  is  simply  part  of  an 
implicit  contract  (see  the  Introduction).   This  is  akin  to  the  proposition 
that a theory should be judged by its predictions (e.g., whether employment
levels are determined solely by opportunity costs) rather than by its
assumptions (whether a particular contractual provision is physically
present). While there is surely something in this idea, it seems a
considerable act of faith to rely on the notion of an implicit contract given
that so little is presently known about how implicit agreements are enforced
(but see Part III.4). In fact in view of the current ignorance about this, it
seems  curious  —  and  unfortunate  —  that  the  whole  field  often  goes  under  the 
name  of  Implicit  Labor  Contracts. 

Given  that  empirical  support  for  the  labor  contract  model  is  at  present 
rather  limited,  the  question  arises  whether  the  contracting  approach  is  worth 
pursuing.   Not  surprisingly,  we  feel  strongly  that  the  answer  is  yes.   The 
main  reason  is  that  there  appears  to  be  no  serious  alternative  around  for 
analyzing  this  class  of  problems.   For  example,  the  wage-setting-union  model 
described  above  may  fit  some  of  the  facts  better,  but  it  is  based  on  the 
assumption  that  the  parties  fail  to  exploit  all  the  gains  from  trade,  which, 
in  theoretical  terms,  seems  unacceptable.   Rather  than  abandoning  the 
contracting  framework,  therefore,  it  seems  desirable  to  try  to  modify  it  so  as 
to make it more realistic, e.g. by incorporating further moral hazards or
asymmetries of information or (and perhaps this is most important) by
introducing  the  costs  of  writing  contracts  (see  Part  III).   It  should  also  be 
noted  that  firm/worker  relationships  are  only  one  application  of  the 
contracting  framework.   In  Part  III,  we  argue  that  other  applications,  e.g.  to 
input  supply  contracts  between  firms,  may  in  the  long-run  be  at  least  as 
fruitful, as well perhaps as being more consistent with the facts.

Appendix to Part II


It is easy to show that, if f is continuous, a solution to (2.9) exists.
Denote it by $(L_1, L_2, I_1, I_2)$. Clearly $\pi_1 I_1 + \pi_2 I_2 = 0$. Furthermore, at
least one of the truth-telling constraints must be binding (otherwise a Pareto
improvement could be made by moving in the direction of the first-best). We
consider three cases.


Case 1 (both truth-telling constraints are binding):

(1)       $f(s_2, L_2) + I_2 = f(s_2, L_1) + I_1$,
          $f(s_1, L_1) + I_1 = f(s_1, L_2) + I_2$.

In this case the manager is indifferent between the two states. Hence
$I_1 = I_2 = 0$ since if $I_i < I_j$, a Pareto improvement could be achieved by
replacing the old contract by a new contract $(L_i, I_i, L_i, I_i)$. But, if
$I_1 = I_2 = 0$, it is optimal to set $L_i$ equal to its first-best value, $L_i^*$,
i = 1,2, which contradicts (1). Therefore Case 1 is impossible.


Case 2 (only the second truth-telling constraint is binding):

(2)       $f(s_1, L_1) + I_1 = f(s_1, L_2) + I_2$,
          $f(s_2, L_2) + I_2 > f(s_2, L_1) + I_1$.

The second inequality, together with (2.10), implies that
$f(s_2, L_2) + I_2 > f(s_1, L_1) + I_1$, i.e. the manager prefers $s_2$ to $s_1$. In
this case, however, by a standard risk-sharing argument, a Pareto improvement
can be made by lowering $I_2$ and raising $I_1$, keeping $\pi_1 I_1 + \pi_2 I_2$
constant (the truth-telling constraints will continue to be satisfied). Hence
Case 2 is ruled out.

We are left with Case 3, where only the first truth-telling constraint
is binding; this case is analyzed in the text.

PART III   Incomplete Contracts

III.1   The Benefits of Writing Long-Term Contracts

The literature on labor contracts focusses on income-shifting as the
motivation  for  a  long-term  contract;  that  is,  on  the  gains  the  parties  receive 
from  transferring  income  from  one  state  of  the  world  or  one  period  to  another. 
In  the  ABG  model,  the  worker  wants  to  insure  his  income.   This  is  also  the 
case  in  the  Holmstrom  model,  where  in  addition  the  worker  wants  to  smooth  his 
consumption  over  time.   Finally,  in  the  Azariadis/Grossman-Hart  model,  it  is 
the  entrepreneur/manager  who  desires  insurance. 

In  all  these  models,  the  rationale  for  the  contract  would  disappear  if 
the  agents  were  risk-neutral  and  faced  perfect  capital  markets.   Even  if  risk 
aversion and imperfect capital markets are present, the ABG and Holmstrom
explanations  of  labor  contracts  rely  on  the  assumption  that  firms  have  a 
comparative  advantage  in  providing  insurance  and  income-smoothing 
opportunities  to  workers. 

It  is  perhaps  unfortunate  that  so  much  attention  has  been  devoted  to 
"financial"  contracts  of  this  type.   As  we  noted  in  the  introduction,  a 
fundamental  reason  for  long-term  relationships  is  the  existence  of  investments 
which  are  to  some  extent  party-specific.   While  this  lock-in  effect  is  often 
used  to  motivate  the  long-term  relationship  between  workers  and  firms  in  labor 
contract  models,  it  then  tends  to  be  ignored.   Yet  this  lock-in  effect  can 
explain  the  existence  and  characteristics  of  long-term  contracts  even  in  the 
presence  of  risk  neutrality  and  perfect  capital  markets.   Moreover,  in  the 
case,  say,  of  supply  contracts  involving  large  firms,  risk  neutrality  and 
perfect  capital  markets  may  be  reasonable  assumptions  in  view  of  the  many 
outside insurance and borrowing/lending opportunities available to such
parties.

The  importance  of  a  long-term  contract  when  there  are  relationship- 
specific  investments  can  be  seen  from  the  following  example  (based  on  Grout 
(1984);  see  also  Crawford  (1982)).   Let  B,  S  be,  respectively,  the  buyer  and 
seller  of  (one  unit  of)  an  input.   Suppose  that  in  order  to  realize  the 
benefits  of  the  input,  B  must  make  an  investment,  a,  which  is  specific  to  S; 
for  example,  B  might  have  to  build  a  plant  next  to  S.   Assume  that  there  are 
just  two  periods;  the  investment  is  made  at  date  0,  while  the  input  is 
supplied  and  the  benefits  are  received  at  date  1.   S's  supply  cost  at  date  1 
is  c,  while  B's  benefit  function  is  b(a)  (all  costs  and  benefits  are  measured 
in  date  1  dollars). 

If  no  long-term  contract  is  written  at  date  0,  the  parties  will 
determine  the  terms  of  trade  from  scratch  at  date  1.   If  we  assume  that 
neither  party  has  alternative  trading  partners  at  date  1,  there  is,  given  B's 
sunk  investment  cost  a,  a  surplus  of  b(a)  -  c  to  be  divided  up.   A  simple 
assumption  to  make  is  that  the  parties  split  this  50:50  (this  is  the  Nash 
bargaining  solution).   That  is,  the  input  price  p  will  satisfy  b(a)-p  =  p-c. 
This  means  that  the  buyer's  overall  payoff,  net  of  his  investment  cost,  is 


(3.1)        $b(a) - p - a = \frac{b(a) - c}{2} - a$.


The  buyer,  anticipating  this  payoff,  will  choose  a  to  maximize  (3.1),  i.e.  to 
maximize  1/2  b(a)  -  a. 

This  is  to  be  contrasted  with  the  efficient  outcome,  where  a  is  chosen 
to  maximize  total  surplus,  b(a)  -  c  -  a.   Maximizing  (3.1)  will  lead  to 
underinvestment;  in  fact,  in  extreme  cases,  a  will  equal  zero  and  trade  will 
not  occur  at  all.   The  inefficiency  arises  because  the  buyer  does  not  receive 
the full return from his investment; some of this return is appropriated by
the seller in the date 1 bargaining. Note that an upfront payment from S to B
at  date  0  (to  compensate  for  the  share  of  the  surplus  S  will  later  receive) 
will  not  help  here,  since  it  will  only  change  B's  objective  function  by  a 
constant  (it's  like  a  lump-sum  transfer).   That  is,  it  redistributes  income 
without  affecting  real  decisions. 

Efficiency  can  be  achieved  if  a  long-term  contract  is  written  at  date  0 
specifying  the  input  price  p*  in  advance.   Then  B  will  maximize  b(a)  -  p*  -  a, 
yielding  the  efficient  investment  level,  a*.   An  alternative  method  is  to 
specify  that  the  buyer  must  choose  a  =  a*  (if  not  he  pays  large  damages  to  S) 
—  the  choice  of  p  can  then  be  left  until  date  1,  with  an  upfront  payment  by  S 
being  used  to  compensate  B  for  his  investment.   The  second  method  presupposes 
that investment decisions are publicly observable, and so in practice may be
more complicated than the first (see III.3).
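
The underinvestment result and the fixed-price remedy can be illustrated numerically. The following is a minimal sketch, not part of the original argument: the benefit function b(a) = 4*sqrt(a), the cost c = 1, the price p* = 2.5 and the grid search are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

C = 1.0  # seller's supply cost c (illustrative value, not from the text)


def b(a):
    """Hypothetical concave benefit function; not taken from the text."""
    return 4.0 * math.sqrt(a)


def argmax_on_grid(objective, upper=10.0, steps=10_000):
    """Return the grid point in [0, upper] that maximizes the objective."""
    grid = (upper * i / steps for i in range(steps + 1))
    return max(grid, key=objective)


def buyer_payoff_no_contract(a):
    """Payoff (3.1): half of the date-1 surplus, minus the sunk investment.

    If b(a) < c there is no surplus to split and no trade takes place.
    """
    surplus = max(b(a) - C, 0.0)
    return surplus / 2.0 - a


def total_surplus(a):
    """Objective behind the efficient investment: b(a) - c - a."""
    return b(a) - C - a


def buyer_payoff_fixed_price(a, p_star):
    """Buyer's payoff when the price p* is fixed by contract at date 0."""
    return b(a) - p_star - a


a_holdup = argmax_on_grid(buyer_payoff_no_contract)
a_efficient = argmax_on_grid(total_surplus)
a_contract = argmax_on_grid(lambda a: buyer_payoff_fixed_price(a, p_star=2.5))

print(f"investment without a contract: {a_holdup:.3f}")
print(f"efficient investment a*:       {a_efficient:.3f}")
print(f"investment with fixed price:   {a_contract:.3f}")
```

Under these assumed numbers the first-order conditions give a = 1 without a contract and a* = 4 for the efficient level. Because p* enters the buyer's objective only as a constant, any fixed price yields the same investment choice as maximizing total surplus.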

This  example  (which  formalizes  intuitions  contained  in  Williamson  (1985) 
and  Klein,  Crawford  and  Alchian  (1978))  illustrates  the  role  of  a  long-term 
contract  when  there  are  relationship-specific  investments.   The  word 
"investment"  should  be  interpreted  broadly;  the  same  factors  will  apply 
whenever  one  party  is  forced  to  pass  up  an  opportunity  as  a  result  of  a 
relationship  with  another  party  (e.g.,  A's  "investment"  in  the  relationship 
with  B  may  be  not  to  lock  into  C).   That  is,  the  crucial  element  is  a  sunk 
cost  (direct  or  opportunity)  of  some  sort  (an  effort  decision  is  one  example 
of  a  sunk  cost).   Note  that  the  income-transfer  motive  for  a  long-term 
contract is completely absent here; there is no uncertainty and everything is
in  present  value  terms. 

In  spite  of  their  importance,  the  analysis  of  "real"  contracts  of  this 
type  ("real",  rather  than  "financial",  because  their  rationale  comes  from  the 
existence  of  real  decisions  such  as  investments)  is  in  its  infancy.   A  notable 
early reference is Becker's (1964) analysis of worker training. More
recently, Williamson (1985) and Klein, Crawford and Alchian (1978) have
emphasized  the  difficulty  of  writing  contracts  which  induce  efficient 
relationship-specific  investments  as  an  important  factor  in  explaining 
vertical  integration. 

Part  III  of  this  survey  will  be  concerned  primarily  with  the  analysis  of 
such  "real"  contracts.   At  this  stage,  however,  it  may  be  useful  to  summarize 
the  general  benefits  of  writing  long-term  contracts.   We  have  discussed  the 
income-transfer  and  "real"  motives.   Let  us  note  three  further  benefits. 
First,  if  a  relationship  is  repetitive,  it  may  save  on  transaction  costs  to 
decide  in  advance  what  actions  each  party  should  take  rather  than  to  negotiate 
a succession of short-term contracts. Secondly, if asymmetries of information
arise  during  the  course  of  the  relationship,  letting  the  parties  negotiate  as 
they  go  along  may  lead  to  ex-post  bargaining  inefficiencies  (as  in,  e.g., 
Fudenberg-Tirole  (1983)),  which  can  be  avoided  by  a  long-term  contract. 
Thirdly,  a  long-term  contract  may  be  useful  for  screening  purposes;  e.g.  a 
firm  may  attract  a  productive  worker  by  offering  a  high  future  reward  in  the 
event  that  the  worker  is  successful.   (This  is  an  example  drawn  from  the 
adverse  selection  literature;  see,  e.g.,  Salop-Salop  (1976).) 

Given  the  many  advantages  of  long-term  contracts,  the  question  that 
obviously  arises  is  why  we  don't  see  more  of  them,  and  why  those  we  do  see 
seem  often  to  be  limited  in  scope.   To  this  question  we  now  turn. 

III.2   The Costs of Writing Long-Term Contracts

Contract  theory  is  sometimes  dismissed  because  "we  don't  see  the  long- 
term  contingent  contracts  that  the  theory  predicts".   In  view  of  the  benefits 
of  long-term  contracts,  this  statement,  even  if  true,  needs  to  be  explained. 

The first point to make is that there is no shortage of complex long-term
contracts in the world. Joskow (1985), for example, in his recent study
of  transactions  between  electricity  generating  plants  and  mine-mouth  coal 
suppliers  finds  that  some  contracts  between  the  parties  extend  for  50  years, 
and  a  large  majority  for  over  ten  years.   The  contractual  terms  include 
quality  provisions,  formulae  linking  coal  prices  to  costs  and  prices  of 
substitutes,  indexation  clauses,  etc.,  etc.   The  contracts  are  both 
complicated  and  sophisticated.   Similar  findings  are  contained  in  Goldberg  and 
Ericson's  (1982)  study  of  Petroleum  Coke. 

At  a  much  more  basic  level,  a  typical  contract  for  personal  insurance, 
with  its  many  conditions  and  exemption  clauses  is  not  exactly  a  simple 
document.   Nor  for  that  matter  is  a  typical  house  rental  agreement.   On  the 
other  hand,  as  we  noted  in  Part  II,  labor  contracts  are  often  surprisingly 
rudimentary,  at  least  in  certain  respects. 

Given  that  complex  long-term  contracts  are  found  in  some  situations  but 
not  others,  it  is  natural  to  explain  any  observed  contract  as  an  outcome  of  an 
optimization  process  in  which  the  relative  benefits  and  costs  of  additional 
length  and  complexity  are  traded  off  at  the  margin.   We  have  given  some 
indication  of  the  determinants  of  the  benefits  of  length  and  complexity.   But 
what  about  the  costs?   These  are  much  harder  to  pin  down  since  they  fall  under 
the  general  heading  of  "transaction  costs",  a  notoriously  vague  and  slippery 
category.   Of  these,  the  following  seem  to  be  important:   (1)  the  cost  to  each 
party  of  anticipating  the  various  eventualities  that  may  occur  during  the  life 
of  the  relationship;  (2)  the  cost  of  deciding,  and  reaching  an  agreement 
about,  how  to  deal  with  such  eventualities;  (3)  the  cost  of  writing  the 
contract  in  a  sufficiently  clear  and  unambiguous  way  that  the  terms  of  the 
contract  can  be  enforced;  and  (4)  the  legal  cost  of  enforcement. 

One  point  to  note  is  that  all  these  costs  are  present  also  in  the  case 
of  short-term  contracts,  although  presumably  they  are  usually  smaller.   In 
particular, since the short-term future is more predictable, the first cost is
likely  to  be  much  reduced,  and  so  possibly  is  the  third.   However,  it 
certainly  isn't  the  case  that  there  is  a  sharp  division  between  short-term 
contracts  and  long-term  contracts,  with,  as  is  sometimes  supposed,  the  former 
being  costless  and  the  latter  being  infinitely  costly. 

It  is  also  worth  emphasizing  that,  when  we  talk  about  the  cost  of  a 
long-term  contract,  we  are  presumably  referring  to  the  cost  of  a  "good"  long- 
term  contract.   There  is  rarely  significant  cost  or  difficulty  in  writing  some 
long-term  contract.   For  example,  the  parties  to  an  input  supply  contract 
could  agree  on  a  fixed  price  and  level  of  supply  for  the  next  fifty  years. 
They don't, presumably because such a rigid arrangement would be very
inefficient.

Due  to  the  presence  of  transaction  costs,  the  contracts  people  write 
will  be  incomplete  in  important  respects.   The  parties  will  quite  rationally 
leave  out  many  contingencies,  taking  the  point  of  view  that  it  is  better  to 
"wait  and  see  what  happens"  than  to  try  to  cover  a  large  number  of 
individually  unlikely  eventualities.   Less  rationally,  the  parties  will  leave 
out  other  contingencies  that  they  simply  do  not  anticipate.   Instead  of 
writing very long-term contracts the parties will write limited term
contracts, with the intention of renegotiating these when they come to an end.
Contracts  will  often  contain  clauses  which  are  vague  or  ambiguous,  sometimes 
fatally  so. 

Anyone  familiar  with  the  legal  literature  on  contracts  will  be  aware 
that  almost  every  contractual  dispute  that  comes  before  the  courts  concerns  a 
matter  of  incompleteness.   In  fact,  incompleteness  is  probably  at  least  as 
important  empirically  as  asymmetric  information  as  an  explanation  for 
departures  from  "ideal"  Arrow-Debreu  contingent  contracts.   In  spite  of  this, 
relatively  little  work  has  been  done  on  this  topic,  the  reason  presumably 
being that an analysis of transaction costs is so complicated. One problem is
that  the  first  two  transaction  costs  referred  to  above  are  intimately 
connected  to  the  idea  of  bounded  rationality  (as  in  Simon  (1982)),  a 
successful  formalization  of  which  doesn't  yet  exist.   As  a  result,  perhaps, 
the  few  attempts  that  have  been  made  to  analyze  incompleteness  have 
concentrated  on  the  third  cost,  the  cost  of  writing  the  contract. 

One  approach,  due  to  Dye  (1985),  can  be  described  as  follows.   Suppose 
that  the  amount  of  input,  q,  traded  between  a  buyer  and  seller  should  be  a 
function  of  the  product  price,  p,  faced  by  the  buyer:   q  =  f(p).   Writing  down 
this  function  is  likely  to  be  costly.   Dye  measures  the  costs  in  terms  of  how 
many different values q takes on as p varies; in particular, if #{q | q = f(p) for
some p} = n, the cost of the contract is (n-1)c, where c > 0. This means that a
noncontingent statement "q=5 for all p" has zero cost, the statement "q=5 for
p<8, q=10 for p>8" has cost c, and so on.
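Dye's counting measure can be illustrated with a short sketch. The finite price grid, the value c = 1 and the three example contracts are assumptions chosen for illustration; the treatment of p = 8 in the step contract is also an assumption, since the statement in the text leaves that point unspecified.

```python
from fractions import Fraction


def dye_cost(contract, prices, c):
    """Cost (n - 1) * c, where n is the number of distinct quantities
    the contract prescribes over the given finite grid of prices."""
    distinct_quantities = {contract(p) for p in prices}
    return (len(distinct_quantities) - 1) * c


# Hypothetical price grid 0, 0.5, ..., 16 (33 points) and unit cost c = 1.
prices = [Fraction(k, 2) for k in range(33)]
c = Fraction(1)


def flat(p):
    return 5  # "q = 5 for all p"


def step(p):
    return 5 if p < 8 else 10  # q = 5 below 8, q = 10 otherwise (assumed at p = 8)


def proportional(p):
    return 2 * p  # hypothetical contract whose quantity changes at every grid point


print(dye_cost(flat, prices, c))          # 0
print(dye_cost(step, prices, c))          # 1
print(dye_cost(proportional, prices, c))  # 32
```

On this measure the cost of the proportional contract depends only on how many grid points there are, not on how simple the rule is to state.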

The costs Dye is trying to capture are real enough, but the measure used
has some drawbacks. It implies, for example, that the statement "q = p^(1/2) for
all p" has infinite cost if p has infinite domain, and does not distinguish
between the cost of a simple function like this and the cost of a much more
complicated function. As another example, a simple indexation clause to the
effect that the real wage should be constant (i.e. the money wage = λp for
some λ) would never be observed since, according to Dye's measure, it too has
infinite cost. In addition, the approach does not tell us how to assess the
cost of indirect ways of making q contingent; for example, the contract could
specify that the buyer, having observed p, can choose any amount of input q he
likes, subject to paying the seller σ for each unit.

There is another way of getting at the cost of including contingent
statements. This is to suppose that what is costly is describing the state of
the world ω rather than writing a statement per se. That is, suppose that ω
cannot be represented simply by a product price, but is very complex and of
high dimension; e.g., it includes the state of demand, what other firms in
the industry are doing, the state of technology, etc. Many of these
components may be quite nebulous. To describe the state ex-ante in sufficient
detail that an outsider, e.g. the courts, can verify whether a particular
state ω has occurred, and so enforce the contract, may be prohibitively
costly. Under these conditions, the contract will have to omit some (in
extreme cases, all) references to the underlying state.

Similar  to  this  is  the  case  where  what  is  costly  is  describing  the 
characteristics  of  what  is  traded  or  the  actions  (e.g.  investments)  the 
parties  must  take.   For  example,  suppose  that  there  is  only  one  state  of  the 
world,  but  that  q  now  represents  the  quality  of  the  item  traded  rather  than 
the  quantity.   An  ideal  contract  would  give  a  precise  description  of  q. 
However,  quality  may  be  multidimensional  and  very  difficult  to  describe 
unambiguously  (and  vague  statements  to  the  effect  that  quality  should  be 
"good"  may  be  almost  meaningless).   The  result  may  be  that  the  contract  will 
have  to  be  silent  on  many  aspects  of  quality  and/or  actions. 

Models of this sort of incompleteness have been investigated by
Grossman-Hart (1984) and Hart-Moore (1985) for the case where the state of
the world cannot be described and by Bull (1985) and Grossman-Hart (1984,
1986) for the case where quality and/or actions cannot be specified. These
models  do  not  rely  on  any  asymmetry  of  information  between  the  parties.   Both 
parties  may  recognize  that  the  state  of  the  world  is  such  that  the  buyer's 
benefit  is  high  or  the  seller's  cost  is  low,  or  that  the  quality  of  an  item  is 
good  or  bad  or  that  an  investment  decision  is  appropriate  or  not.   The 
difficulty  is  conveying  this  information  to  others.   That  is,  it  is  the 
asymmetry  of  information  between  the  parties  on  the  one  hand,  and  outsiders, 
such  as  the  courts,  on  the  other  hand,  which  is  the  root  of  the  problem. 
To use the jargon, incompleteness arises because states of the world,
quality  and  actions  are  observable  (to  the  contractual  parties)  but  not 
verifiable  (to  outsiders). 

We  describe  an  example  of  an  incomplete  contract  along  these  lines  in 
the  next  section. 

III.3   Incomplete Contracts: An Example


We  will  give  an  example  of  an  incomplete  contract  for  the  case  where  it 
is  prohibitively  costly  to  specify  the  quality  characteristics  of  the  item  to 
be  exchanged  or  the  parties'  investment  decisions.   Similar  problems  arise 
when  the  state  of  the  world  cannot  be  described.   The  example  is  a  variant  of 
the  models  in  Grossman-Hart  (1984,  1986),  Hart-Moore  (1985). 

Consider  a  buyer  B  who  wishes  to  purchase  a  unit  of  input  from  a  seller 
S.   B  and  S  each  make  a  (simultaneous)  specific  investment  at  date  0  and  trade 
occurs at date 1. Let I_B, I_S denote, respectively, the investments of B and
S, and to simplify assume that each can take on only two values, H or L (high
or low). These investments are observable to B and S, but are not verifiable
(they are complex and multidimensional, or represent effort decisions) and
hence are noncontractible. We assume that at date 1 the seller can supply
either  "satisfactory"  input  or  "unsatisfactory"  input.   "Unsatisfactory"  input 
has  zero  benefit  for  the  buyer  and  zero  cost  for  the  seller  (so  it's  like  not 
supplying  at  all).   "Satisfactory"  input  yields  benefits  and  costs  which 
depend  on  ex-ante  investments.   These  are  indicated  in  Figure  1. 

                I_S = H     I_S = L

    I_B = H     (10,6)      (9,7)

    I_B = L     (9,7)       (6,10)

                  Figure 1

The first component refers to the buyer's benefit, v, and the second to the
seller's cost, c. So when I_B = H, I_S = H, v = 10 and c = 6 (if input is
"satisfactory"). From these gross benefits and costs must be subtracted
investment  costs,  which  we  assume  to  be  1.9  if  investment  is  high  and  zero  if 
it's  low  (for  each  party).   (All  benefits  and  costs  are  in  date  1  dollars.) 
Note  that  there  is  no  uncertainty  and  so  attitudes  to  risk  are  irrelevant. 

Our  assumption  is  that  the  characteristics  of  the  input  (e.g.  whether 
it's  "satisfactory")  are  observable  to  both  parties,  but  are  too  complicated 
to  be  specified  in  a  contract.   The  fact  that  they  are  observable  means  that 
the  buyer  can  be  given  the  option  to  reject  the  input  at  date  1  if  he  doesn't 
like  it.   This  will  be  important  in  what  follows. 

An  important  feature  of  the  example  is  that  the  seller's  investment 
affects  not  only  the  seller's  costs  but  also  the  buyer's  benefit  and  the 
buyer's  investment  affects  not  only  the  buyer's  benefit  but  also  the  seller's 
costs.   The  idea  here  is  that  a  better  investment  by  the  seller  increases  the 
quality  of  "satisfactory"  input;  and  a  better  investment  by  the  buyer  reduces 
the  cost  of  producing  "satisfactory"  input,  i.e.  input  that  can  be  used  by  the 
buyer.

For instance, one can imagine that B is an electricity generating plant
and S is a coal mine that the plant sites next to. I_B might refer to the type
of coal-burning boiler that the plant installs and I_S to the type of mine the
coal supplier develops. By investing in a better boiler, the power plant may
be able to burn lower quality coal, thus reducing the seller's costs, while
still increasing its gross (of investment) profit. On the other hand, by
developing a good seam, the mine may raise the quality of coal supplied while
reducing its variable cost.

The first-best has I_B = I_S = H, with total surplus equal to 10-6-3.8 =
.2 (if I_B = H and I_S = L, or vice versa, surplus = .1, and if I_B = I_S = L, no
trade occurs and surplus is zero). This could be achieved if either
investment or quality were contractible, as follows. If investment is
contractible, an optimal contract would specify that the buyer must set I_B = H
and the seller I_S = H and give the buyer the right to accept the input at date
1 at price p_1 or reject it at price p_0. If 10 > p_1 - p_0 > 6, the seller will
be induced to supply satisfactory input (the gain, p_1 - p_0, from having the
input accepted exceeds the seller's supply cost) and the buyer to accept it
(the buyer's benefit exceeds the incremental price p_1 - p_0). If, on the other
hand, quality is contractible, the contract could specify that the seller must
supply input with the precise characteristics which make it satisfactory when
I_B = I_S = H. Each party would then have the socially correct investment
incentives since, with specific performance, neither party's investment
affects the other's payoff (there is no externality).
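The surplus figures quoted for the four investment pairs can be checked with a few lines of arithmetic. The sketch below uses the Figure 1 values as described in the text; the dictionary layout and function name are illustrative choices, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction as F

# Figure 1: (buyer's benefit v, seller's cost c), keyed by (I_B, I_S).
PAYOFFS = {
    ("H", "H"): (F(10), F(6)),
    ("H", "L"): (F(9), F(7)),
    ("L", "H"): (F(9), F(7)),
    ("L", "L"): (F(6), F(10)),
}
INVESTMENT_COST = {"H": F(19, 10), "L": F(0)}  # 1.9 if high, zero if low


def total_surplus(i_b, i_s):
    """Gains from efficient trade minus both parties' investment costs."""
    v, c = PAYOFFS[(i_b, i_s)]
    gains_from_trade = max(v - c, F(0))  # no trade when v < c
    return gains_from_trade - INVESTMENT_COST[i_b] - INVESTMENT_COST[i_s]


for pair in PAYOFFS:
    print(pair, float(total_surplus(*pair)))
```

The pair (H, H) gives 0.2, the two mixed pairs give 0.1, and (L, L) gives zero because trade does not take place.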

We  now  show  that  the  first-best  cannot  be  achieved  if  investment  and 
quality  are  both  noncontractible.   A  second-best  contract  can  make  price  a 
function  of  any  variable  that  is  verifiable.   Investment  and  quality  are  not 
verifiable (nor is v or c), but we shall suppose that whether the item is
accepted or rejected by the buyer is, so the contract can specify an
acceptance price, p_1, and a rejection price, p_0. In fact, p_1, p_0 can also be
made  functions  of  (verifiable)  messages  that  the  buyer  and  seller  send  each 
other,  reflecting  the  investment  decisions  that  both  have  made  (as  in  Hart- 
Moore  (1985)).   The  following  argument  is  unaffected  by  such  messages  and  so, 
for  simplicity,  we  ignore  them  (but  see  footnote  7). 

Can we sustain the first-best by an appropriate choice of p_1, p_0? The
seller always has the option of choosing I_S = L and producing an item of
unsatisfactory quality, which yields him a net payoff of p_0. In order to
induce him not to do this, we must have

(3.2)    p_1 - 6 - 1.9 ≥ p_0,  i.e.  p_1 - p_0 ≥ 7.9.

Similarly the buyer's net payoff must be no less than -p_0 since he always has
the option of choosing I_B = L and rejecting the input. That is,

(3.3)    10 - p_1 - 1.9 ≥ -p_0,  i.e.  p_1 - p_0 ≤ 8.1.

So (p_1 - p_0) must lie between 7.9 and 8.1.

Now the seller has an additional option. If he expects the buyer to set
I_B = H, he can choose I_S = L and, given that 8.1 ≥ p_1 - p_0 ≥ 7.9, still be
confident that trade of "satisfactory" input will occur under the original
contract at date 1 (the buyer will accept satisfactory input since v = 9 > p_1
- p_0, while the seller will supply it since p_1 - p_0 > 7 = c). But if the
seller deviates, his payoff rises from p_1 - 6 - 1.9 to p_1 - 7. (The example
is symmetric and so a similar deviation is also profitable for the buyer.)
Hence the I_B = I_S = H equilibrium will be disrupted.
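The deviation argument can be verified numerically. The following is a minimal sketch, assuming p_0 = 0, no renegotiation, and that ties are resolved in favour of trade (the text uses strict inequalities); the helper names and the grid of acceptance prices are illustrative.

```python
from fractions import Fraction as F

# Figure 1: (buyer's benefit v, seller's cost c), keyed by (I_B, I_S).
PAYOFFS = {
    ("H", "H"): (F(10), F(6)),
    ("H", "L"): (F(9), F(7)),
    ("L", "H"): (F(9), F(7)),
    ("L", "L"): (F(6), F(10)),
}
INVESTMENT_COST = {"H": F(19, 10), "L": F(0)}
OTHER = {"H": "L", "L": "H"}


def net_payoffs(i_b, i_s, p1, p0=F(0)):
    """(buyer, seller) payoffs under the original contract."""
    v, c = PAYOFFS[(i_b, i_s)]
    if v >= p1 - p0 >= c:  # assumption: ties resolved in favour of trade
        return v - p1 - INVESTMENT_COST[i_b], p1 - c - INVESTMENT_COST[i_s]
    return -p0 - INVESTMENT_COST[i_b], p0 - INVESTMENT_COST[i_s]


def is_nash(i_b, i_s, p1, p0=F(0)):
    """True if neither party gains by switching its own investment."""
    buyer, seller = net_payoffs(i_b, i_s, p1, p0)
    buyer_deviation = net_payoffs(OTHER[i_b], i_s, p1, p0)[0]
    seller_deviation = net_payoffs(i_b, OTHER[i_s], p1, p0)[1]
    return buyer >= buyer_deviation and seller >= seller_deviation


# Acceptance prices 7.90, 7.91, ..., 8.10, i.e. the range allowed by (3.2)-(3.3).
admissible_prices = [F(79, 10) + F(k, 100) for k in range(21)]
first_best_survives = any(is_nash("H", "H", p1) for p1 in admissible_prices)
seller_gain = net_payoffs("H", "L", F(8))[1] - net_payoffs("H", "H", F(8))[1]

print(first_best_survives)  # False
print(float(seller_gain))   # 0.9
```

At every admissible price the seller gains 0.9 by cutting investment: he saves 1.9 in investment cost and loses only 1 in higher production cost.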

We see, then, that the first-best cannot be sustained if investment and
quality are both noncontractible. The reason is that it will be in the
interest of the seller (resp. the buyer) to reduce investment since, although
this reduces social benefit by lowering the buyer's (resp. seller's) benefit,
it increases the seller's (resp. buyer's) own profit. The optimal second-best
contract will instead have I_B = H, I_S = L (or vice-versa), which will be
sustained by a pair of prices p_1, p_0 such that 9 > p_1 - p_0 > 7.³ Total surplus
will be .1 instead of the first-best level of .2.

The conclusion is that inefficiencies can arise in incomplete contracts
even though the parties have common information (both observe investments and
both observe quality). The particular inefficiency that occurs in the model
analyzed is in ex-ante investments. Ex-post trade is always efficient
relative to these investments since p_1, p_0 can and will be chosen such that v
> p_1 - p_0 > c, i.e. the seller wants to supply and the buyer to receive
satisfactory input. The example can be regarded as formalizing the intuition
of Williamson (1985) and Klein, Crawford and Alchian (1978) that
relationship-specific investments will be distorted due to the impossibility
of writing complete contingent contracts; note that this result is achieved
without imposing arbitrary restrictions on the form of the permissible contract.⁴

The above example can be modified to illustrate an interesting
possibility that can arise in an incomplete contract. Suppose we change the
I_S = H, I_B = L payoffs in Figure 1 from (9,7) to (9,8.2) and the I_S = L, I_B =
H payoffs to (7.8,7). The first-best stays the same. But now it can be
sustained as long as renegotiation of the contract is impossible at date 1.
In particular, choose p_1 - p_0 = 8. Then if either the buyer or seller
deviates from the first-best, v > p_1 - p_0 > c will be violated and so the
deviating party's profit will fall to p_0 (for the seller) or -p_0 (for the
buyer).
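The effect of the modification can be checked by comparing the two payoff tables at a price gap of 8. This sketch assumes p_0 = 0, no renegotiation, and, consistent with the renegotiation discussion that follows, that (7.8, 7) applies when the seller is the low investor and (9, 8.2) when the buyer is; the function names are illustrative.

```python
from fractions import Fraction as F

# (buyer's benefit v, seller's cost c), keyed by (I_B, I_S).
ORIGINAL = {
    ("H", "H"): (F(10), F(6)),
    ("H", "L"): (F(9), F(7)),
    ("L", "H"): (F(9), F(7)),
    ("L", "L"): (F(6), F(10)),
}
MODIFIED = dict(ORIGINAL)
MODIFIED[("H", "L")] = (F(78, 10), F(7))  # seller invests low: v = 7.8, c = 7
MODIFIED[("L", "H")] = (F(9), F(82, 10))  # buyer invests low: v = 9, c = 8.2

INVESTMENT_COST = {"H": F(19, 10), "L": F(0)}
OTHER = {"H": "L", "L": "H"}


def net_payoffs(table, i_b, i_s, gap):
    """(buyer, seller) payoffs with p_0 = 0 and p_1 = gap, no renegotiation."""
    v, c = table[(i_b, i_s)]
    if v >= gap >= c:  # trade occurs only if both sides are willing
        return v - gap - INVESTMENT_COST[i_b], gap - c - INVESTMENT_COST[i_s]
    return -INVESTMENT_COST[i_b], -INVESTMENT_COST[i_s]


def is_nash(table, i_b, i_s, gap):
    buyer, seller = net_payoffs(table, i_b, i_s, gap)
    buyer_deviation = net_payoffs(table, OTHER[i_b], i_s, gap)[0]
    seller_deviation = net_payoffs(table, i_b, OTHER[i_s], gap)[1]
    return buyer >= buyer_deviation and seller >= seller_deviation


print(is_nash(ORIGINAL, "H", "H", F(8)))  # False
print(is_nash(MODIFIED, "H", "H", F(8)))  # True
```

Under the modified payoffs a deviation by either party destroys trade at the contracted prices, so each earns zero instead of 0.1.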

However, the first-best may not be sustainable if renegotiation is
possible. The previous argument showing that 7.9 ≤ p_1 - p_0 ≤ 8.1 still
applies. Without loss of generality, set p_0 = 0 in the following. Suppose
the seller chooses I_S = L, while I_B = H. Then at date 1, the parties will
realize that since 7.8 = v < p_1, trade, although mutually beneficial, will not
occur under the original contract (the buyer will reject the input). Hence
they will presumably lower the price p_1 to lie between 7 and 7.8. But as long
as the new price p_1' > 7.2, the seller's net payoff will be higher than if he
doesn't deviate (since his first-best surplus p_1 - 7.9 ≤ 0.2). Hence the
seller will deviate unless his power to keep p_1 up in the renegotiation is
rather limited. (If the parties split the gains from renegotiation 50:50, p_1'
= 7.4 and the seller will certainly deviate.)⁵ If the seller is a "poor"
bargainer, however, the buyer will presumably deviate, i.e. he will set I_B =
L, anticipating that I_S = H; the parties will then agree to raise the price
p_1 to lie between c = 8.2 and v = 9 and the buyer's net payoff will rise as
long as the new price p_1'' < 8.8.
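The renegotiation argument can be made concrete with a small calculation. The sketch below assumes p_0 = 0 and models bargaining power by a single hypothetical parameter, `seller_share`, the fraction of the ex-post gains v - c that the seller captures; this parameterization is an illustrative assumption, not part of the original example.

```python
from fractions import Fraction as F

HIGH_INVESTMENT_COST = F(19, 10)


def renegotiated_price(v, c, seller_share):
    """New acceptance price when the original p_1 blocks trade (p_0 = 0)."""
    return c + seller_share * (v - c)


def seller_gain_from_deviating(p1, seller_share):
    # Seller sets I_S = L while I_B = H, so (v, c) = (7.8, 7).
    new_price = renegotiated_price(F(78, 10), F(7), seller_share)
    return (new_price - 7) - (p1 - 6 - HIGH_INVESTMENT_COST)


def buyer_gain_from_deviating(p1, seller_share):
    # Buyer sets I_B = L while I_S = H, so (v, c) = (9, 8.2).
    new_price = renegotiated_price(F(9), F(82, 10), seller_share)
    return (9 - new_price) - (10 - p1 - HIGH_INVESTMENT_COST)


even_split_price = renegotiated_price(F(78, 10), F(7), F(1, 2))
print(float(even_split_price))  # 7.4

# For every admissible p_1 and every division of the gains, someone deviates.
prices = [F(79, 10) + F(k, 100) for k in range(21)]
shares = [F(k, 20) for k in range(21)]
someone_always_deviates = all(
    max(seller_gain_from_deviating(p1, s), buyer_gain_from_deviating(p1, s)) > 0
    for p1 in prices
    for s in shares
)
print(someone_always_deviates)  # True
```

In this parameterization the two deviation gains always sum to 0.6, so at least one party strictly prefers to deviate whatever the division of bargaining power.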

In  this  modified  example,  then,  the  buyer  and  seller  can  do  better  if 
they  can  precommit  themselves  not  to  renegotiate  the  contract!   We  have 
encountered this possibility before in II.4D, but note that the method
proposed  there  for  preventing  renegotiation  (in  the  static  case)  will  not  work 
here  (that  method  depended  on  the  worker  not  knowing  what  the  firm  was  going 
to  do  until  "the  last  moment",  whereas  here  both  parties  will  recognize  the 
need  for  renegotiation  as  soon  as  investment  decisions  are  made).   Simply 
putting  in  renegotiation  penalties  in  the  original  contract  (the  buyer  must 
pay  the  seller  a  million  dollars  if  there  is  renegotiation)  is  unlikely  to  be 
effective  since  the  parties  can  always  agree  to  rescind  the  old  contract, 
thereby voiding these penalties (see Schelling (1960)).

If  renegotiation  cannot  be  prevented,  the  condition  that  there  are  no 
ex-post Pareto improvements from recontracting must be imposed as a constraint
in the original contract (as in Hart-Moore (1985); recall that in the present
context information is symmetric and so it is clear what a Pareto improvement
is).   We  have  already  noted  that  such  a  constraint  may  be  important  in  dynamic 
asymmetric  information  labor  contract  models,  and  it  seems  to  apply  to  other 
contexts  too.   For  instance,  a  firm  may  wish  to  convince  a  customer  that  it's 
not  going  to  reduce  its  price  in  the  future,  but  a  binding  contract  to  that 
effect  may  be  infeasible  since  the  firm  and  customer  know  that  they  will  agree 
to rescind it at a later date (an agreement not to raise price, on the other
hand, may not suffer from the same difficulty). Other examples in the same
spirit may be found in Schelling's (1960) interesting discussion of the
difficulties of making commitments.⁷

Returning  to  our  example,  we  may  illustrate  a  theory  of  ownership 
presented  in  Grossman-Hart  (1984,  1986).   It  is  sometimes  suggested  that  when 
transaction  costs  prevent  the  writing  of  a  complete  contract,  there  may  be  a 
reason  for  firm  integration  (see  Williamson  (1985)).   Consider  the  payoffs  of 
Figure  1  and  suppose  that  B  takes  over  S.   The  control  that  B  thereby  gains 
over  S's  assets  may  allow  B  to  affect  S's  costs  in  various  ways,  and  this  may 
reduce the possibility of opportunistic behavior by S. To take a very simple
(and contrived) example, suppose that if S chooses I_S = L, B can take some
action, a, with respect to S's assets at date 1 so as to make S's cost of
supplying either satisfactory or unsatisfactory input equal to 9 (in the
coal-electricity example, a might refer to the part of the mine's seam the coal is
taken out of; note that we now drop the assumption that the cost of supplying
unsatisfactory input is zero). Imagine furthermore that this action increases
B's benefit, so that B will indeed take it at date 1 if S chooses L. Then
with this extra degree of freedom, the first-best can be achieved. In
particular, if p_1 = p_0 + 6.1, I_B = I_S = H is a Nash equilibrium since, by the
above reasoning, any deviation by the seller will be punished, while if the
buyer deviates, the seller will supply unsatisfactory input given that p_1 < p_0
+ 7.
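The equilibrium claim for the integrated case can be checked in the same way. This is a sketch under stated assumptions: payoffs are measured relative to p_0, the price gap is 6.1, the Figure 1 payoffs apply unless S invests low, and the buyer is assumed to accept satisfactory input whenever the seller supplies it after action a (the text says only that a raises B's benefit). The function names are illustrative.

```python
from fractions import Fraction as F

HIGH_INVESTMENT_COST = F(19, 10)
PRICE_GAP = F(61, 10)  # p_1 - p_0; payoffs below are measured relative to p_0


def seller_payoff(i_s, i_b):
    """Seller's best date-1 choice under B's ownership, net of p_0."""
    if i_s == "L":
        # B takes action a: supplying either kind of input now costs 9.
        cost_satisfactory, cost_unsatisfactory, invested = F(9), F(9), F(0)
    else:
        cost_satisfactory = F(6) if i_b == "H" else F(7)  # Figure 1 costs
        cost_unsatisfactory, invested = F(0), HIGH_INVESTMENT_COST
    supply_satisfactory = PRICE_GAP - cost_satisfactory  # accepted at p_1
    supply_unsatisfactory = -cost_unsatisfactory         # rejected at p_0
    return max(supply_satisfactory, supply_unsatisfactory) - invested


def buyer_payoff(i_b):
    """Buyer's payoff given I_S = H, measured relative to -p_0."""
    v, c = (F(10), F(6)) if i_b == "H" else (F(9), F(7))
    invested = HIGH_INVESTMENT_COST if i_b == "H" else F(0)
    trade = v >= PRICE_GAP >= c  # seller supplies satisfactory input only if gap >= c
    return (v - PRICE_GAP if trade else F(0)) - invested


print(float(seller_payoff("H", "H")), float(seller_payoff("L", "H")))
print(float(buyer_payoff("H")), float(buyer_payoff("L")))
```

The seller's payoff falls from -1.8 to -2.9 (relative to p_0) if he invests low, and the buyer's falls from 2 to 0, so neither deviation pays; a sufficiently large p_0 can then be chosen to secure the seller's participation.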

Note  that  if  action  a  could  be  specified  in  the  initial  contract,  there 
would  be  no  need  for  integration:   the  initial  contract  would  simply  say  that 
B  has  the  right  to  choose  a  at  date  1.   Ownership  becomes  important,  however, 
if  (i)  a  is  too  complicated  to  be  specified  in  the  date  0  contract  and 
therefore  qualifies  as  a  residual  right  of  control;  and  (ii)  residual  rights 
of control over an asset are in the hands of whoever owns that asset. The
point  is  that  under  incompleteness  the  allocation  of  residual  decision  rights 
matters  since  the  contract  cannot  specify  precisely  what  each  party's 
obligations are in every state of the world. To the extent that ownership of
an  asset  guarantees  residual  rights  of  control  over  that  asset,  vertical  and 
lateral  integration  can  be  seen  as  ways  of  ensuring  particular  —  and 
presumably  efficient  —  allocations  of  residual  decision  rights.   (While  in 
the  above  example,  integration  increases  efficiency,  this  is  in  no  way  a 
general  conclusion.   In  Grossman-Hart  (1984),  (1986),  examples  are  presented 
where  integration  reduces  efficiency.) 

Before  concluding  this  section,  we  should  emphasize  that  for  reasons  of 
tractability  we  have  confined  our  attention  to  incompleteness  due  to  a  very 
particular  sort  of  transaction  cost.   In  practice,  some  of  the  other 
transaction costs we have alluded to are likely to be at least as important,
if  not  more  so.   For  example,  in  the  type  of  model  we  have  analyzed,  although 
the  parties  cannot  describe  the  state  of  the  world  or  quality  characteristics, 
they  are  still  supposed  to  be  able  to  write  a  contract  which  is  unambiguous 
and  which  anticipates  all  eventualities.   This  is  very  unrealistic.   In 
practice,  a  contract  might,  say,  have  B  agreeing  to  rent  S's  concert  hall  for 
a  particular  price.   But  suppose  S's  hall  then  burns  down.   The  contract  will 
usually  be  silent  about  what  is  meant  to  happen  under  these  conditions  (there 
is  no  hall  to  rent,  but  should  S  pay  B  damages  and  if  so  how  much?),  and  so, 
in  the  event  of  a  dispute,  the  courts  will  have  to  fill  in  the  "missing 
provision".   (A  situation  where  it  becomes  impossible  or  extremely  costly  to 
supply a contracted-for good is known as one of "impossibility" or
"frustration" in the legal literature.) An analysis of this sort of
incompleteness, although extremely hard, is a very important topic for future
research. It is likely to yield a much richer and more realistic view of the
way contracts are written and throw light on how courts should assess damages
(this latter issue has begun to be analyzed in the law and economics
literature; see, e.g., Shavell (1980)).⁸

III.4   Self-Enforcing Contracts

The  previous  discussion  has  been  concerned  with  explicit  binding 
contracts  that  are  enforced  by  outsiders,  such  as  the  courts.   Even  the  most 
casual  empiricism  tells  us  that  many  agreements  are  not  of  this  type. 
Although  the  courts  may  be  there  as  a  last  resort  (the  shadow  of  the  law  may 
therefore be important), these agreements are enforced on a day to day basis
by  custom,  good  faith,  reputation,  etc.   Even  in  the  case  of  a  serious 
dispute,  the  parties  may  take  great  pains  to  resolve  matters  themselves  rather 
than  go  to  court.   This  leads  to  the  notion  of  a  self-enforcing  or  implicit 
contract  (the  importance  of  informal  arrangements  like  this  in  business  has 
been  stressed  by  Macaulay  (1963)  and  Ben-Porath  (1980)  among  others). 

People  often  by-pass  the  legal  process  presumably  because  of  the 
transaction  costs  of  using  it.   The  costs  of  writing  a  "good"  long-term 
contract discussed in III.2 are relevant here. So also is the skill with
which  the  courts  resolve  contractual  disputes.   If  contracts  are  incomplete 
and  contain  missing  provisions  as  well  as  vague  and  ambiguous  statements, 
appropriate  enforcement  may  require  abilities  and  knowledge  (what  was  in  the 
parties'  minds?)  that  many  judges  and  juries  do  not  possess.   This  means  that 
going  to  court  may  be  a  considerable  gamble  —  and  an  expensive  one  at  that. 
(This is an example of the fourth transaction cost noted in III.2.)

Although  the  notion  of  implicit  or  self-enforcing  contracts  is  often 
invoked,  a  formal  study  of  such  agreements  has  begun  only  recently  (see,  e.g. 
Bull  (1985)),  with  a  considerable  stimulus  coming  from  the  theory  of  repeated 
games  (cf.  the  model  in  Section  1.7).   This  literature  has  stressed  the  role 
of  reputation  in  "completing"  a  contract.   That  is,  the  idea  is  that  a  party 
may  behave  "reasonably"  even  if  he  is  not  obliged  to  do  so  in  order  to  develop 
a  reputation  as  a  decent  and  reliable  trader.   In  some  instances  such 
reputational effects will operate only within the group of contractual parties
—  this  is  sometimes  called  internal  enforcement  of  the  contract  —  while  in 
others  the  effects  will  be  more  pervasive.   The  latter  will  be  the  case  when 
some  outsiders  to  the  contract,  e.g.  other  firms  in  the  industry  or  potential 
workers  for  a  firm,  observe  unreasonable  behavior  by  one  party,  and  as  a 
result  are  more  reluctant  to  deal  with  it  in  the  future.   In  this  case  the 
enforcement  is  said  to  be  external  or  market-based.   (The  model  of  1.7  uses 
the  idea  of  external  enforcement.)   Note  that  there  may  be  a  tension  between 
this  external  enforcement  and  the  reasons  for  the  absence  of  a  legally  binding 
contract  in  the  first  place  —  the  more  people  can  observe  the  behavior,  the 
more  likely  it  is  to  be  verifiable. 

The  distinction  between  an  incomplete  contract  and  a  standard  asymmetric 
information  contract  should  be  emphasized  here.   It  is  the  former  that  allows 
reputation  to  operate  since  the  parties  have  the  same  information  and  can 
observe  whether  reasonable  behavior  is  being  maintained.   In  the  latter  case, 
it's  unclear  how  reputation  can  overcome  the  asymmetry  of  information  between 
the  parties  that  is  the  reason  for  the  departure  from  an  Arrow-Debreu 
contract. 

The  role  of  reputation  in  sustaining  a  contract  can  be  illustrated  using 
the  following  model  (based  on  Bull  (1985)  and  Kreps  (1984);  this  is  an  even 
simpler model of incomplete contracts than that of the last section). Assume
that a buyer, B, and a seller, S, wish to trade an item at date 1 which has
value v to the buyer and cost c to the seller, where v > c. There are no
ex-ante investments and the good is homogeneous, so quality is not an issue.
Suppose,  however,  that  it  is  not  verifiable  whether  trade  actually  occurs. 
Then  a  legally  binding  contract  which  specifies  that  the  seller  must  deliver 
the  item  and  the  buyer  must  pay  p,  where  v  >  p  >  c,  cannot  be  enforced.   The 
reason  is  that,  assuming  (as  we  shall)  that  simultaneous  delivery  and  payment 
are infeasible, if the seller has to deliver first, the buyer can always deny
that  delivery  occurred  and  refuse  payment,  while  if  the  buyer  has  to  pay 
first,  the  seller  can  always  claim  later  that  he  did  deliver  even  though  he 
didn't.   As  a  result,  if  the  parties  must  rely  on  the  courts,  a  gainful 
trading  opportunity  will  be  missed. 

The  idea  that  not  even  the  level  of  trade  is  verifiable  is  extreme,  and 
Bull  (1985)  in  fact  makes  the  more  defensible  assumption  that  it's  the  quality 
of  the  good  that  can't  be  verified  (in  Bull's  model,  S  is  a  worker  and  quality 
refers to his performance). Bull supposes that quality is observable to the
buyer only with a lag, so that take-it-or-leave-it offers of the type
considered in the last section aren't feasible. As a result the seller always
has an incentive to produce minimum quality (which corresponds in the above
model to zero output). Making quantity nonverifiable is a cruder but simpler
way of capturing the same idea (this is the approach taken in Kreps (1984)).

Note  that  in  the  above  model  incompleteness  of  the  contract  arises 
entirely  from  transaction  cost  (3),  the  difficulty  of  writing  and  enforcing 
the  contract. 

To  introduce  reputational  effects  one  supposes  that  this  trading 
relationship  is  repeated.   Bull  (1985)  and  Kreps  (1984)  follow  the  supergame 
literature  and  assume  infinite  repetition  in  order  to  avoid  unravelling 
problems.   This  approach,  as  is  well  known,  suffers  from  a  number  of 
difficulties.   First,  the  assumption  of  infinite  (or  in  some  versions, 
potentially  infinite)  life  is  hard  to  swallow.   Secondly,  "reasonable" 
behavior,  i.e.  trade,  is  sustained  by  the  threat  that  if  one  party  behaves 
unreasonably  so  will  the  other  party  from  then  on.   While  this  threat  is 
"credible"  (more  precisely,  subgame  perfect),  it's  unclear  why  the  parties 
couldn't  decide  to  continue  to  trade  after  a  deviation,  i.e.  to  "let  bygones 
be bygones". (See Farrell (1984); this is another example where the ability
to renegotiate ex-post hurts the parties ex-ante.)

It  would  seem  that  a  preferable  approach  is  to  assume  that  the 
relationship  has  finite  length,  but  introduce  asymmetric  information,  as  in 
Kreps-Wilson  (1982)  and  Milgrom-Roberts  (1982).   The  following  is  based  on 
some  very  preliminary  work  that  we  have  undertaken  along  these  lines. 

Suppose  that  there  are  two  types  of  buyers  in  the  population,  honest  and 
dishonest.  Honest  buyers  will  always  honor  any  agreement  or  promise  that  they 
have  made  while  dishonest  ones  will  do  so  only  if  this  is  profitable.  A  buyer 
knows his own type, but others do not. It is common knowledge that the
fraction of honest buyers in the population is π, 0 < π < 1. In contrast, all
sellers  are  known  to  be  dishonest.   All  agents  are  risk  neutral. 

Assume  for  simplicity  that  a  single  buyer  and  seller  are  matched  at  date 
0  with  neither  having  any  alternative  trading  partners  at  this  date  or  in  the 
future  (we  are  here  departing  from  the  ex-ante  perfect  competition  story  that 
we  have  maintained  for  most  of  the  paper).   Consider  first  the  one  period 
case.   Then  a  date  0  agreement  can  be  represented  as  follows. 


   p_1        S        p_2

    I         II       III


             Figure 2


The interpretation is that the buyer promises to pay the seller p_1 before date
1 (stage I); in return, the seller promises to supply the item at date 1
(stage II); and in return for this, the buyer promises to make a further
payment of p_2 (stage III).

We should mention one further assumption. Honest buyers, although they
never breach an agreement first, are supposed to feel under no obligation to
fulfil the terms of an agreement that has already been broken by a seller
(interestingly, although this is a theory of buyer psychology, it has
parallels in the common law). Note that if a buyer ever breaks an agreement
first, he reveals himself to be dishonest, with the consequence that no
further self-enforcing agreement with the seller is possible and hence trade
ceases.

What is an optimal agreement? Consider Figure 2. The seller knows that
he will receive p_2 only with probability π since a dishonest buyer will
default at the last stage. Since the seller is himself dishonest, he will
supply at Stage II only if it is profitable for him to do so, i.e. only if


(3.4)    π p_2 - c ≥ 0.


Assume  for  simplicity  that  the  seller  has  all  the  bargaining  power  at  date  0 
(nothing  that  follows  depends  on  this).   Then  the  seller  will  wish  to  maximize 
his  overall  payoff 


(3.5)    p_1 + π p_2 - c


subject  to  (3.4)  which  makes  it  credible  that  he'll  supply  at  stage  II  and 
also  the  constraint  that  he  does  not  discourage  an  honest  buyer  from 
participating  in  the  agreement  at  date  0.   Since  with  (3.4)  satisfied,  buyers 
know  that  they  will  receive  the  item  for  sure,  this  last  condition  is 


(3.6)    v - p_1 - p_2 ≥ 0.

Note that a dishonest buyer's payoff v - p_1 is always higher than an honest
buyer's  payoff  given  in  (3.6),  so  there  is  no  way  to  screen  out  dishonest 
buyers.   In  the  language  of  asymmetric  information  models,  the  equilibrium  is 
a  pooling  one. 

Since the seller's payoff is increasing in p_1, (3.6) will hold with
equality (the buyer gets no surplus). (More generally, changes in p_1 simply
redistribute surplus between the two parties without changing either's
incentive to breach.) If we substitute for p_1 in (3.5), the seller's payoff
becomes v - p_2(1 - π) - c, which, when maximized subject to (3.4), yields the
solution p_2 = c/π. The maximized net payoff is


(3.7)    v - c/π,


which  is  less  than  the  first-best  level,  v  -  c. 

We  see  then  that  the  conditions  for  trade  are  more  stringent  in  the 
absence of a binding contract. If c/π > v > c, there are gains from trade which
won't  be  realized  in  a  one  period  relationship. 
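
The one period solution can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name one_period_agreement and the parameter values in the usage note are illustrative assumptions. It assumes, as in the text, that the seller has all the bargaining power, so that (3.4) and (3.6) both hold with equality.

```python
def one_period_agreement(v, c, pi):
    """Seller-optimal one period agreement (illustrative helper, not from the paper).

    v  : buyer's value of the item
    c  : seller's cost
    pi : fraction of honest buyers, 0 < pi < 1
    Returns (p1, p2, seller_payoff), or None when no trade can be sustained.
    """
    if not (0 < pi < 1):
        raise ValueError("pi must lie strictly between 0 and 1")
    p2 = c / pi                  # (3.4) binds: pi * p2 - c = 0
    p1 = v - p2                  # (3.6) binds: honest buyer gets zero surplus
    payoff = p1 + pi * p2 - c    # (3.5); algebraically equal to v - c/pi, i.e. (3.7)
    if payoff < 0:
        return None              # c/pi > v: the seller does better by not trading
    return p1, p2, payoff
```

For example, with v = 10, c = 4 and π = 0.5 the call returns p_1 = 2, p_2 = 8 and a seller payoff of 2, compared with a first-best surplus of 6. With c = 6 instead, c/π = 12 exceeds v, so the function returns None even though v > c.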

Suppose  now  that  the  relationship  is  repeated.   Consider  a  two  period 
version  of  the  above  and  assume  no  discounting.   Now  the  following  diagram 
applies: 


   p_1        S        p_2        S        p_3

    I         II       III        IV        V


                      Figure 3

That is, the agreement says that the buyer pays, the seller supplies the
first  time,  the  buyer  pays  more,  the  seller  supplies  a  second  time,  and  the 
buyer makes a final payment. Rather than solving for the optimal arrangement,
we shall simply show that the seller can do better than in the one period
case. Let p_3 = c/π, p_2 = c and p_1 = 2v - c - c/π. Then (i) the seller will
supply at Stage IV (if matters have got that far), knowing that he will
receive p_3 with probability π; (ii) both honest and dishonest buyers will pay
p_2 at Stage III, the latter because, at a cost of c, they thereby ensure
supply worth v > c at Stage IV; (iii) the seller will supply at Stage II
because this gives him a net payoff of p_2 + π p_3 - 2c ≥ 0, while if he doesn't
the arrangement is over and his payoff is zero; (iv) an honest buyer is
prepared to participate since his surplus is nonnegative (actually zero).
The seller's overall expected net payoff is


(3.8)    p_1 + p_2 + π p_3 - 2c = 2v - c - c/π,


which  exceeds  twice  the  one  period  payoff.   Hence  trade  is  more  likely  to  take 
place  in  a  two  period  relationship  than  in  a  one  period  one.   In  fact  it  can 
be  shown  that  the  above  is  an  optimal  two  period  agreement. 
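
Conditions (i) to (iv) and the payoff comparison can be verified for particular numbers. The sketch below is only an illustration under the payments stated in the text (p_3 = c/π, p_2 = c, p_1 = 2v - c - c/π); the function name two_period_check, the dictionary keys and the small tolerance used for floating-point comparisons are our own assumptions.

```python
def two_period_check(v, c, pi):
    """Check conditions (i)-(iv) for the two period agreement (illustrative only)."""
    p3 = c / pi
    p2 = c
    p1 = 2 * v - c - c / pi
    tol = 1e-9  # tolerance for floating-point comparisons (an implementation choice)
    checks = {
        # (i) seller supplies at Stage IV: expected final payment covers cost
        "i_seller_supplies_stage_IV": pi * p3 - c >= -tol,
        # (ii) a dishonest buyer pays p2 to obtain an item worth v, then defaults
        "ii_dishonest_buyer_pays_stage_III": v - p2 >= -tol,
        # (iii) seller supplies at Stage II: continuation payoff is nonnegative
        "iii_seller_supplies_stage_II": p2 + pi * p3 - 2 * c >= -tol,
        # (iv) honest buyer participates: total value covers total payments
        "iv_honest_buyer_participates": 2 * v - p1 - p2 - p3 >= -tol,
    }
    seller_payoff = p1 + p2 + pi * p3 - 2 * c   # left-hand side of (3.8)
    twice_one_period = 2 * (v - c / pi)         # twice the one period payoff (3.7)
    return checks, seller_payoff, twice_one_period
```

With v = 10, c = 4 and π = 0.5 all four conditions hold and the seller's payoff is 8, against 4 for two separate one period agreements. With c = 6 the one period payoff would be negative, yet the two period payoff is 2, which illustrates why trade is more likely in the longer relationship.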

Repetition  improves  things  by  allowing  the  honest  buyer  to  pay  less 
second  time  round  (Stage  III)  than  third  time  round  (Stage  V).   That  is,  the 
arrangement  back-loads  payments.   This  is  acceptable  to  the  seller  because  he 
knows  that  even  a  dishonest  buyer  will  not  default  at  Stage  III  since  he  has  a 
large  stake  in  the  arrangement  continuing.   To  put  it  another  way,  the 
dishonest  buyer  doesn't  want  to  reveal  his  dishonesty  at  too  early  a  stage. 

The same arrangement can be used when there are more than two periods:
the buyer promises to pay c at every stage except the last, when he pays c/π.
In fact the per period surplus of the seller from such an arrangement
converges to the first-best level (v - c) as the number of periods tends to
infinity (assuming no discounting, of course).
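
The convergence claim can be illustrated with a short calculation. This is a sketch under our reading of the arrangement, which the text does not spell out in full: an up-front payment that leaves the honest buyer zero surplus, then c after every delivery except the last, and c/π after the final delivery. The function name per_period_surplus is hypothetical.

```python
def per_period_surplus(v, c, pi, periods):
    """Seller's expected per period surplus under the back-loaded arrangement.

    Assumed schedule (our reading of the text, not a result quoted from it):
    an up-front payment extracting the honest buyer's surplus, c after each
    delivery except the last, and c/pi after the final delivery.
    """
    if periods < 1:
        raise ValueError("periods must be at least 1")
    later_payments = (periods - 1) * c + c / pi
    upfront = periods * v - later_payments       # honest buyer's surplus is zero
    # Intermediate payments are made by both buyer types; the last only with prob. pi.
    expected_receipts = upfront + (periods - 1) * c + pi * (c / pi)
    return (expected_receipts - periods * c) / periods
```

With v = 10, c = 4 and π = 0.5 the per period surplus is 2 for one period, 4 for two periods and 5.996 for a thousand periods, approaching the first-best level v - c = 6. Under these assumptions the shortfall is (c/π - c) divided by the number of periods.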

Although  the  above  analysis  is  extremely  provisional  and  sketchy,  we  can 
draw  some  tentative  conclusions  about  the  role  of  reputation  and  indicate  some 
directions  for  further  research.   First,  the  notion  of  a  psychic  cost  of 
breaking  an  agreement  seems  to  be  a  useful  —  as  well  as  a  not  unrealistic  — 
basis  for  a  theory  of  self-enforcing  contracts.   It  is  obviously  desirable  to 
drop  the  assumption  that  some  agents  are  completely  honest  and  others 
completely  dishonest,  and  assume  instead  that  the  typical  trader  has  a  finite 
psychic  cost  of  breaking  an  agreement,  where  this  cost  is  distributed  in  the 
population  in  a  known  way.   In  other  words,  everybody  "has  their  price",  but 
this  price  varies.   Preliminary  work  along  these  lines  suggests  that  the  above 
results  generalize;  in  particular,  repetition  makes  it  easier  to  sustain  a 
self-enforcing agreement.

Of  course,  asymmetries  of  information  about  psychic  costs  are  not  the 
only  possible  basis  for  a  theory  of  reputation.   For  example,  the  buyer  and 
seller  could  have  private  information  about  v  and  c,  and  might  choose  their 
trading  strategies  to  influence  perceptions  about  the  values  of  these 
variables.   A  theory  of  self-enforcing  contracts  should  ideally  generate 
results  which  are  not  that  sensitive  to  where  the  asymmetry  of  information  is 
placed. The work of Fudenberg-Maskin (1984) in a related context, however,
suggests  that  this  may  be  a  difficult  goal  to  achieve. 

There  are  a  number  of  other  natural  directions  in  which  to  take  the 
model.   One  is  to  introduce  trade  with  other  parties.   For  example,  the  seller 
may  trade  with  a  succession  of  buyers  rather  than  a  single  one.   The  extent  to 
which  repetition  increases  per  period  surplus  in  this  case  depends  on  whether 
new  buyers  observe  the  past  broken  promises  of  the  seller.   (This  determines 
the degree to which external enforcement operates; more generally, a new buyer
may  observe  that  default  occurred  in  the  past,  but  be  unsure  about  who  was 
responsible  for  it.)   If  new  buyers  don't  observe  past  broken  promises, 
repetition  achieves  nothing,  which  gives  a  very  strong  prediction  of  the 
possible  benefits  of  a  long-term  relationship  between  a  fixed  buyer  and 
seller.   Even  if  past  broken  promises  are  observed  perfectly,  it  appears  that, 
ceteris  paribus,  a  single  long-term  agreement  may  be  superior  to  a  succession 
of  short-term  ones.   The  reason  is  that  in  the  latter  case  the  constraint  is 
imposed  that  each  party  must  receive  nonnegative  surplus  over  their  term  of 
the  relationship  whereas  in  the  former  case  there  is  only  the  single 
constraint  that  surplus  must  be  nonnegative  over  the  whole  term  (see  Bull 
(1985),  Kreps  (1984)). 

Probably  the  most  important  extension  is  to  introduce  incompleteness  due 
to  other  sorts  of  transaction  costs,  e.g.  the  "bounded  rationality"  costs  (1) 
and (2) discussed in III.2. The problem is that the same factors which make
it difficult to anticipate and plan for eventualities in a formal contract
apply also to informal arrangements. That is, an informal arrangement is also
likely  to  contain  many  "missing  provisions".   But  then  the  question  arises, 
what  constitutes  "reasonable"  or  "desirable"  behavior  (in  terms  of  building  a 
reputation)  with  regard  to  states  or  actions  that  weren't  discussed  ex-ante? 
Custom,  among  other  things,  is  likely  to  be  important  under  these  conditions: 
behavior  will  be  "reasonable"  or  "desirable"  to  the  extent  that  it  is 
generally  regarded  as  such  (for  a  good  discussion  of  this,  see  Kreps  (1984)). 
This raises many new and interesting (as well as extremely difficult)
questions. 

Even  though  our  analysis  of  reputation  is  very  preliminary,  it  can  throw 
some  light  on  the  ABG  implicit  contract  model.   There  the  firm  insures  the 
workers  against  fluctuations  in  their  marginal  product  of  labor.   Uncertainty 
and risk aversion will obviously complicate the analysis of self-enforcing
agreements  considerably,  but  the  above  results  suggest  that  a  long-term 
agreement  which  stabilizes  the  workers'  net  income  may  be  sustainable  even  in 
the  absence  of  a  binding  contract,  particularly  if  trade  is  repeated. 
Moreover,  this  can  be  so  even  if  the  marginal  product  of  labor  is  (perfectly) 
correlated  over  time  (in  the  above  model  it's  constant),  which  suggests  that 
an  implicit  contract  may  be  sustained  also  for  the  asymmetric  information  case 
studied in III.3 (correlation of the marginal product is important because, in
its absence, the asymmetry of information may disappear asymptotically; see
II.4C). With strong correlation, however, the conditions for an implicit
contract  will  be  more  stringent  since  a  firm  that  has  had  a  bad  draw  —  and 
knows  that  this  is  permanent  —  will  have  a  stronger  incentive  to  breach  (see 
Newbery-Stiglitz  (1983)).   More  generally,  the  fact  that  a  contract  must  be 
self-enforcing will impose constraints on the form that it can take. An
analysis  of  the  precise  conditions  under  which  implicit  contracts  can  be 
sustained,  and  their  resulting  characteristics,  when  there  is  risk  aversion 
and  asymmetric  information  seems  an  interesting  and  important  topic  for  future 
research. 

III.5 Summary and Conclusions

The  vast  majority  of  the  theoretical  work  on  contracts  to  date  has  been 
concerned  with  what  might  be  called  "complete"  contracts.   In  this  context,  a 
complete  contract  means  one  that  specifies  each  party's  obligations  in  every 
conceivable  eventuality,  rather  than  a  contract  that  is  fully  contingent  in 
the  Arrow-Debreu  sense.   According  to  this  terminology,  the  asymmetric 
information labor contracts of II.3 are just as complete as the symmetric
information ones of II.1.

In reality it is usually impossible to lay down each party's obligations
completely  and  unambiguously  in  advance,  and  so  most  actual  contracts  are 
seriously  incomplete.   In  Part  III,  we  have  tried  to  indicate  some  of  the 
complications  of  such  incompleteness.   Among  other  things,  we  have  seen  that 
incompleteness  can  lead  to  departures  from  the  first-best  even  when  there  are 
no asymmetries of information among the contracting parties (and, moreover,
the  parties  are  risk  neutral). 

More  important  perhaps  than  this  is  the  fact  that  incompleteness  raises 
new  and  difficult  questions  about  how  the  behavior  of  the  contracting  parties 
is  determined.   To  the  extent  that  incomplete  contracts  do  not  specify  the 
parties'  actions  fully,  i.e.  they  contain  "gaps",  additional  theories  are 
required  to  tell  us  how  these  gaps  are  filled  in.   Among  other  things,  outside 
influences  such  as  custom  or  reputation  may  become  important  under  these 
conditions.   In  addition,  outsiders,  such  as  the  courts  (or  arbitrators),  may 
have  a  role  to  play  in  filling  in  missing  provisions  of  the  contract  and 
resolving  ambiguities  rather  than  in  simply  enforcing  an  existing  agreement. 
Incompleteness  can  also  throw  light  on  the  importance  of  the  allocation  of 
decision  rights  or  rights  of  control.   If  it  is  too  costly  to  state  precisely 
how  a  particular  asset  is  to  be  used  in  every  state  of  the  world,  it  may  be 
efficient  simply  to  give  one  party  "control"  of  the  asset,  in  the  sense  that 
he  is  entitled  to  do  what  he  likes  with  it,  subject  perhaps  to  some  explicit 
(contractible)  limitations. 

While  the  importance  of  incompleteness  is  very  well  recognized  by 
lawyers,  as  well  as  by  those  working  in  law  and  economics,  it  is  only 
beginning  to  be  appreciated  by  economic  theorists.   It  is  to  be  hoped  that 
work  in  the  next  few  years  will  lead  to  significant  advances  in  our  formal 
understanding  of  this  phenomenon.   Unfortunately,  progress  is  unlikely  to  be 
easy  since  many  aspects  of  incompleteness  are  intimately  connected  to  the 
notion of bounded rationality, a satisfactory formalization of which doesn't
yet  exist. 

As  a  final  illustration  of  the  importance  of  incompleteness,  consider 
the  following  question.   Why  do  parties  frequently  write  a  limited  term 
contract,  with  the  intention  of  renegotiating  this  when  it  comes  to  an  end, 
rather  than  writing  a  single  contract  that  extends  over  the  whole  length  of 
their  relationship?   In  a  complete  contract  framework  such  behavior  cannot  be 
advantageous  since  the  parties  could  just  as  well  calculate  what  will  happen 
when  the  contract  expires  and  include  this  as  part  of  the  original  contract. 
It  is  to  be  hoped  that  future  work  on  incomplete  contracts  will  allow  this 
very  basic  question  to  be  answered. 

Footnotes to Part I

1.  Grossman  and  Hart  (1983)  offer  a  set  of  sufficient  conditions  for 
existence.   A  key  condition  is  that  the  probabilities  that  the  agent 
controls  are  bounded  away  from  zero. 

2.  Our  discussion  of  the  optimal  incentive  scheme  would  not  materially 
change by assuming that the principal is risk averse. Only the left-hand
side of (4) would change to v'(x-s(x))/u'(s(x)). We could also have
imposed  constraints  on  the  agent's  wealth  so  that  s(x)  >   w  and  (4)  would 
remain  intact  with  this  constraint  effective  whenever  s(x)  <  w  in  (4). 
The  case  of  a  wealth  constraint  is  of  some  economic  interest  though.   If 
the  wealth  constraint  is  binding  it  may  force  the  agent  to  receive  more 
than  u.   The  economic  intuition  is  that  if  the  agent  cannot  be  punished 
sufficiently  to  induce  him  to  choose  H,  then  a  bribe  -  extra  rewards  for 
good  outcomes  —  will  be  the  only  alternative.   These  rewards  may  well 
lead  to  slack  in  (2)  as  Becker  and  Stigler  (1974)  first  noted. 
Subsequently,  Shapiro  and  Stiglitz  (1984)  have  used  this  feature  to 
study  the  efficiency  wage  hypothesis,  a  theory  of  underemployment 
arising  from  the  difference  between  compensation  and  opportunity  cost. 

3.  Alternatively,  of  course,  one  can  work  with  any  one-parameter  family 
(for  which  a  solution  is  known  to  exist)  and  then  interpret  the 
characterization  as  referring,  not  to  this  family  necessarily,  but  to 
the  tangent  space  of  distributions  described  by  (9). 

4a.    Grossman  and  Hart  (1983)  study  cases  in  which  the  first  order  approach 
may not be applicable. Even with MLRP, incentive schemes need not be
monotone.   On  the  other  hand,  the  result  that  sufficient  statistics  are 
sufficient  for  designing  optimal  incentive  schemes  does  not  depend  on 
the  first  order  approach.   Also,  a  more  informative  system  (in  the 
Blackwell  sense)  is  strictly  better  than  a  less  informative  one, 
assuming  that  the  garbling  matrix  that  connects  the  two  systems  has  full 
rank.   However,  signals  that  provide  additional  information  about  the 
agent's  strategy  may  not  be  valuable  when  the  first  order  approach 
fails.

4b.    Hidden Information Models, viewed in distribution space, are typically
of  high  dimension,  because  contingent  strategies  result  in  rich 
distributional  choices  for  the  agent  (see  section  1.6).   This  is  why  the 
analysis of Hidden Information Models proceeds along quite different
lines  than  the  analysis  of  Hidden  Action  Models. 

4c.    Share-cropping  rules  are  almost  exclusively  linear  despite  great 
variations  in  stochastic  environments. 

5.  We  remind  the  reader  of  our  discussion  of  explicit  versus  implicit 
incentive  schemes  in  the  introduction.   Some  would  argue  that  real  world 
schemes  are  quite  complex,  viewed  as  equilibrium  phenomena. 

6.  This  could  be  one  reason  for  the  prevalence  of  linear  sharing  rules  in 
share-cropping.   It  may  also  explain  why  corporate  tax  schemes  are  more 
linear than income tax schemes. Presumably, corporations can circumvent
non-linearities in tax schemes more easily than individuals. (Some
would  argue  that  individuals  can  do  a  lot  of  arbitrage  as  well,  making 
income  tax  a  lot  less  progressive  than  it  appears.) 

7.  Harris  and  Raviv  (1979)  study  optimal  forcing  contracts. 

8.  This  can  be  simply  illustrated  in  the  case  of  a  risk-neutral  agent. 
Then an infinity of schemes will be first-best. They include a linear
scheme  with  unitary  slope  as  well  as  the  aforementioned  step-function. 
However,  if  the  agent  gets  some  noisy  information  about  the  technology 
before  choosing  his  effort,  the  linear  scheme  will  be  uniquely  optimal. 
This  idea  is  used  in  Laffont  and  Tirole  (1986). 

9.  We  venture  the  guess  that  in  multi-dimensional  agency  models  additional 
signals  are  valuable  precisely  when  they  give  information  about 
dimensions  of  choice  in  which  there  is  a  conflict  of  interest.   In  one- 
dimensional  models  there  is  a  conflict  of  interest  always  (by 
assumption).   The  result  that  additional  information  has  value  if  it  is 
informative  is  true  always  in  that  case. 

10.  It is worth noting that in this example the agent could privately
manufacture  the  optimal  degree  of  relative  performance  evaluation  by 
trading  in  other  firms'  assets.   In  other  words,  the  principal  could 
equally  well  pay  the  agent  based  on  x  alone  and  leave  it  up  to  the  agent 
to filter out uncontrollable risk. (Of course, the agent must not be
allowed  to  short-sell  stock  in  his  own  firm.) 

11.  The  models  are  different  in  some  other  respects  as  well.   Malcomson  and 
Spinnewyn  consider  a  finite  horizon  with  a  general  utility  function, 
while Fudenberg et al. consider the infinitely repeated discounted case
with an exponential utility function for the agent. The exponential
assumption does not appear to be essential, however. Its main advantage
is  that  the  optimal  sequence  of  short  term  contracts  is  simply  the 
optimal  one-period  contract  repeated  in  each  period.   With  a  general 
utility  function  this  will  not  be  the  case,  because  the  agent's  wealth 
level  will  be  changing.   Note  that  this  means  that  even  with  a  sequence 
of  short-term  contracts  memory  will  play  a  role,  since  contracts  will  be 
contingent  on  the  past  implicitly  in  equilibrium. 

12.  A  related  reputation  model  concerning  risk  taking,  which  derives  very 
interesting  predictions  about  the  nature  of  debt  contracts  and  credit 
rating  in  capital  markets,  is  in  D.  Diamond  (1985). 

13.  A  somewhat  different  dimension  of  the  same  problem  appears  when  a  party 
contracts  with  many  independent  agents  in  a  decentralized  fashion.   This 
has  been  recently  looked  at  by  Cremer  and  Riordan  (1986),  but  it 
deserves  much  more  attention. 

Footnotes to Part II

1.  On  the  empirical  importance  of  such  relationships,  see  Hall  (1980). 

2.  In  a  more  general  model,  the  size  of  the  workforce  would  be  a  choice 
variable. 

3.  Two  assumptions  are  embodied  here.   First  that  p  is  independent  of  the 
shock  s  hitting  the  firm;  and,  secondly,  that  the  firm  and  workers  are 
sufficiently  small  that  their  actions  do  not  affect  prices.   We  shall 
maintain  both  assumptions  throughout  Parts  II  and  III. 

4.  The  reason  is  the  following.   In  a  spot  market,  a  worker's  incentive  to 
work  hard  in  a  good  state  where  the  wage  rate  is  high  (the  substitution 
effect)  will  be  offset  by  his  desire  to  consume  a  lot  of  leisure  given 
that  his  income  is  high  (the  income  effect);  and  conversely  in  a  bad 
state.   In  a  contractual  setting,  the  income  effect  is  reduced  in  size 
because  the  firm  provides  income  insurance  across  different  states  of 
the  world. 

5.  We  have  assumed  that  the  firm  is  risk  neutral,  but  the  main  results 
generalize  to  the  case  of  firm  risk  aversion.   In  particular,  as  long  as 
the  firm  is  "less  risk  averse"  than  the  workers,  workers'  incomes  will 
be  stabilized  relative  to  the  spot  market  outcome.   Note  also  that  (2.2) 
continues  to  hold  when  the  firm  is  risk-averse. 

6.  Although  the  Knightian  argument  can  be  made  that  entrepreneurs  are,  by 
self-selection, less risk-averse than workers. For a formalization, see
Kihlstrom-Laffont  (1979). 

7.  There  is  an  obvious  parallel  between  Holmstrom's  theory  and  Becker's 
(1964)  analysis  of  worker  training. 

8.  It  is  also  worth  pointing  out  that  various  forms  of  disguised  exit  fees 
may  actually  be  quite  common;  consider,  e.g.,  non-vested  pensions. 

9.  Note that deposits are used in some contexts; consider, for instance,
rental  deposits. 

10.  If  the  party  with  private  information  is  risk-neutral,  the  first-best 
can  be  achieved  by  making  this  party  the  residual  income  claimant. 

11.  The  analysis  below  follows  Hart  (1983). 

12.  A more general contract would make the outcome (I_i, L_i) depend
stochastically on the report s_i. Such random contracts are more
complicated  to  analyze  and,  at  least  for  the  two  state  case  considered 
here,  do  not  lead  to  substantially  different  results.   On  random 
schemes,  see  Maskin-Riley  (1984)  and  Moore  (1985). 

13.  The first-best could be achieved if the manager were risk neutral, since
in this case no insurance is required at all, i.e., I_1 = I_2 = 0 and L_i =
L(s_i), i = 1,2, which satisfies the truth-telling constraints.

14.  Some  versions  of  the  model  assume  instead  that  the  manager  is  risk 
neutral but cannot have negative net income (see, e.g., Farmer (1985)).
This  amounts  to  a  form  of  risk  aversion,  however,  since  it  is  equivalent 
to  supposing  that  negative  net  income  gives  the  manager  a  utility  of 
minus  infinity. 

15.  Feldstein's  (1976)  work  suggests  that  surprisingly  many  layoffs  are  in 
fact  temporary. 

16.  It  is  worth  noting  that  utility  functions  that  give  rise  to 
overemployment  predict  that  workers  will  be  better  off  in  low  employment 
states  than  high  employment  ones.   While  this  may  be  plausible  in  the 
case  of  very  short-run  employment  changes,  e.g.  overtime,  it  seems  much 
less  realistic  for  longer-run  changes,  e.g.  severances. 

17.  This  is  of  course  the  same  confusion  that  Lucas  (1972)  exploited. 

18.  (2.21) is simply Akerlof-Miyazaki's (1980) wage bill argument. Note
that the conclusion that an optimal contract must satisfy (2.21)
generalizes to the case where the firm is risk-averse, since a risk-averse
firm also cares only about the size and not about the division of
the  wage  bill  in  a  particular  state. 

19.  Azariadis  was  able  to  explain  involuntary  layoffs  in  his  original  (1975) 
paper,  but  only  by  making  the  arbitrary  assumption  that  layoff  pay  is 
zero. 

20.  A  third  approach  is  to  focus  on  the  costly  search  process  that  laid-off 
workers must engage in to find a new job (see, e.g., Arnott-Hosios-Stiglitz
(1985)). It is clear that workers will not be provided with
the  right  incentives  to  search  if  they  are  guaranteed  a  fixed  utility 
level,  independently  of  whether  they  find  new  employment.   However, 
since  a  firm  can  preserve  incentives  by  giving  a  departing  worker  a  lump 
sum  payment,  it  does  not  follow  from  this  that  laid-off  workers  will  be 
worse  off  than  retained  workers.   In  fact  the  results  on  this  are 
ambiguous.

21.  A  similar  phenomenon  arises  in  a  dynamic  bargaining  context  where  a 
seller  would  like  to  commit  himself  to  make  a  single  take  it  or  leave  it 
offer  to  a  buyer,  but  cannot  do  so  since  he  cannot  constrain  himself  not 
to  make  a  second  offer  if  his  first  offer  is  rejected.   See,  e.g., 
Fudenberg-Tirole  (1983).   Note  that  there  is  a  fundamental  difference 
between  all  the  parties  agreeing  to  tear  up  the  contract  and  one  party 
repudiating  the  contract  —  something  which  we  have  implicitly  assumed 
never  occurs,  e.g.,  because  the  resulting  damage  payment  is  so  large. 

22.  A  start  on  this  has  been  made  by  Dewatripont  (1985). 

Footnotes to Part III

1.  For  example,  suppose  that  B  does  not  have  to  make  any  investment,  but 
that  his  benefit  from  the  input  is  stochastic:   b  =  10  with  probability 
1/2  and  3  with  probability  1/2.   Assume  that  B  learns  the  exact  value  of 
b  at  date  1  while  S  does  not,  that  c  =  0  for  sure  and  that  both  parties 
are  risk  neutral.   Then  if  bargaining  occurs  from  scratch  at  date  1,  and 
S  has  the  power  to  make  take  it  or  leave  it  offers,  he  will  set  a  price 
of  10  (obviously  S  will  not  find  it  profitable  to  set  a  price  other  than 
10  or  3;  the  price  of  10  gives  him  higher  expected  profit).   But  this 
means  that  a  mutually  beneficial  trade  will  not  be  made  in  the  event  b  = 
3.   On  the  other  hand,  the  first-best  can  be  achieved  by  a  long-term 
contract  which  specifies  that  the  buyer  can  insist  on  supply  of  the 
input  in  all  circumstances  at  some  predetermined  price. 

2.  In  some  cases,  the  courts  will  not  enforce  such  an  agreement,  taking  the 
point  of  view  that  the  parties  could  not  really  have  intended  it  to 
apply  unchanged  for  such  a  long  time.   A  clause  to  the  effect  that  the 
parties  really  do  mean  what  they  say  should  be  enough  to  overcome  this 
difficulty,  however.   In  other  cases,  it  may  be  impossible  to  write  a 
binding  long-term  contract  because  the  identities  of  some  of  the  parties 
involved  may  change.   For  example,  one  party  may  be  a  government  that  is 
in  office  for  a  fixed  period,  and  it  may  be  impossible  for  it  to  bind 
its  successors.   This  latter  idea  underlies  the  work  of  Kydland-Prescott 
(1977)  and  Freixas-Guesnerie-Tirole  (1985). 

3.  It  is  worth  pointing  out  why  we  have  assumed  that  both  the  buyer  and 
seller  make  investments.   If  only  the  buyer  (resp.  the  seller)  invests, 
the first-best can be achieved by choosing p_1 - p_0 between 6 and 7
(resp.  9  and  10):   any  deviation  by  the  buyer  (resp.  the  seller)  will 
then  be  unprofitable  since  it  will  lead  to  no  trade.   This  argument 
depends  on  the  assumption  of  no  renegotiation  of  the  contract  at  date  1, 
an  issue  we  deal  with  below.   However,  even  if  renegotiation  is  allowed, 
the  first-best  can  be  achieved  with  one-sided  investment  by  a  contract 
which fixes p_0 but gives the investing party the power to choose any p_1
he  wants.   This  party  then  faces  the  social  net  benefit  function  since 
he  extracts  all  the  surplus. 

4.  The inefficiency that we have identified may not seem that surprising
given  that  our  model  resembles  that  found  in  the  moral  hazard  in  teams 
literature  (see,  e.g.,  Holmstrom  (1982a)).   In  that  literature,  each 
agent  takes  a  private  action  that  affects  total  benefits;  in  our  model, 
investment  decisions  have  this  property.   However,  there  are  some 
differences  between  the  frameworks.   First,  in  our  context,  the  agents 
observe  each  other's  actions.   Secondly,  the  externality  in  investments 
only  materializes  in  the  event  that  trade  occurs,  and  so  the  terms  of 
trade  can  be  used  to  mitigate  the  externality.   In  any  case,  our  purpose 
is  not  the  development  of  a  new  model,  but  rather  the  application  of  it 
to  a  new  context  —  the  analysis  of  the  consequences  of  incomplete 
contracting. 

5.  In  fact  Hart-Moore  (1985)  give  an  argument  that  the  seller  will  be 
strongly  advantaged  in  a  renegotiation  involving  a  price  decrease,  and 
that p_1' = p_0 + 7.8.

6.  The  inclusion  of  a  third  party  in  the  contract  —  with  the  initial  two 
parties  promising  to  pay  the  third  party  a  large  sum  of  money  if  they 
ever  renegotiate  —  also  does  not  overcome  the  problem  since,  if  there 
are  ex-post  gains  from  renegotiation,  the  third  party  can  be  persuaded 
at  date  1  to  give  up  his  claim  to  this  large  sum  in  exchange  for  a 
sidepayment.   The  inclusion  of  a  third  party  may  help,  however,  to  the 
extent  that  it  makes  renegotiation  more  costly,  e.g.  because  it  is  known 
that  the  third  party  will  be  "unavailable"  at  a  crucial  moment  during 
the  renegotiation  process. 

It  should  be  noted  that  third  parties  have  uses  beyond  their 
ability  to  make  renegotiation  more  difficult.   A  third  party  can  act  as 
a  financial  wedge  between  the  initial  contracting  parties,  so  that  the 
amount the seller receives in a particular state (p_0 or p_1) differs from
the  amount  the  buyer  pays,  with  the  third  party  making  up  the 
difference.   Also  whenever  actions  or  states  are  observable  but  not 
verifiable,  it  may  be  possible  to  get  the  initial  parties  to  reveal 
their  information  to  outsiders  by  inducing  them  to  make  reports  to  the 
third  party,  with  a  penalty  due  if  their  reports  don't  match  (equilibria 
other than the truth-telling one may be a problem here). A difficulty
with  either  of  these  arrangements  is  that  there  may  be  a  great  incentive 
for  two  of  the  three  parties  to  collude,  e.g.,  one  of  the  initial  two 
parties  can  deliberately  report  the  wrong  information,  having  agreed 
(secretly)  with  the  third  party  to  divide  up  the  penalty  that  will 
result.   If  such  collusion  is  possible,  it  can  be  shown  in  the  present 
context that a three party contract offers no advantage over a two party
one  (see  Hart-Moore  (1985),  Eswaran-Kotwal  (1984)). 

7.  In our earlier discussion, we mentioned, but did not analyze, the
possibility  that  the  parties  might  send  (verifiable)  messages  to  each 
other  at  date  1,  reflecting  their  jointly  observable  investment 
decisions, with the contract specifying how final prices, p_0 and p_1,
should  depend  on  these  messages.   It  should  be  noted  that  the  use  of 
such  messages  does  not  allow  the  first-best  to  be  achieved  in  the 
original  example  of  Figure  1,  at  least  if  renegotiation  is  possible. 
This is because if v = 9, c = 7, trade will occur at date one at some
price, p_1' say (which will depend on the messages sent). Hence to make
a deviation from the first-best I_B = I_S = H unprofitable for the buyer,
we must have

    10 - p_1 - 1.9 > 9 - p_1',

where p_1 is the trading price when v = 10, c = 6. On the other hand, to
make it unprofitable for the seller, we must have

    p_1 - 6 - 1.9 > p_1' - 7.

These inequalities are inconsistent.
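
One way to see the inconsistency is to add the buyer's and the seller's conditions, so that the prices cancel. The step below is a worked sketch, writing p_1 for the trading price when v = 10, c = 6 and p_1' for the price when v = 9, c = 7:

```latex
\begin{align*}
(10 - p_1 - 1.9) + (p_1 - 6 - 1.9) &> (9 - p_1') + (p_1' - 7) \\
0.2 &> 2 .
\end{align*}
```

Equivalently, the buyer's condition requires p_1' - p_1 > 0.9 while the seller's requires p_1 - p_1' > 0.9, and both cannot hold at once.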


8.  This assumes that the contract cannot be renegotiated at date 1.
However,  even  if  renegotiation  is  possible,  the  buyer's  deviation  will 
be unprofitable. This is because the renegotiated price for
satisfactory input, p_1', will satisfy p_1' > p_0 + 7, and hence the
buyer's net profit if he deviates, 9 - p_1' < 8.1 - p_1.


9.  Mention should also be made of a theory of damages developed by Diamond
and  Maskin  (1979).   Diamond  and  Maskin  consider  a  situation  where  a 
buyer  and  seller  plan  to  trade  with  each  other,  but  recognize  that  it 
may be efficient in some states of the world for one of them to trade
instead with another party; for instance, the seller may find another
buyer  with  a  higher  willingness  to  pay.   Under  these  conditions,  the 
buyer  and  seller  can  use  the  breach  damages  in  their  initial  contract  as 
a  way  of  extracting  surplus  from  this  new  party.   For  example,  the 
bargaining  position  of  a  new  buyer  will  be  weakened  if  he  must 
compensate  the  seller  for  breaching  his  contract  with  the  original 
buyer.   (This  argument  assumes  that  the  new  party  cannot  negotiate  ex- 
post  with  the  buyer  and  seller  together  to  waive  the  damage  payment.) 
This  idea  has  been  used  in  an  interesting  paper  by  Aghion  and  Bolton 
(1985)  to  explain  how  long-term  contracts  can  deter  entry  in  an 
industry. 

10.  The role of uncertainty about v and c in determining reputation has been
investigated by Thomas-Worrall (1984).